
30 August 2022

John Goerzen: The PC & Internet Revolution in Rural America

Inspired by several others (such as Alex Schroeder's post and Szczeżuja's prompt), as well as a desire to get this down for my kids, I figure it's time to write a bit about living through the PC and Internet revolution where I did: outside a tiny town in rural Kansas. And, as I've been back in that same area for the past 15 years, I reflect some on the challenges that continue to play out. Although the stories from the others were primarily about getting online, I want to start by setting some background. Those of you that didn't grow up in the same era as I did probably never realized that a typical business PC setup might cost $10,000 in today's dollars, for instance. So let me start with the background.

Nothing was easy

This story begins in the 1980s. Somewhere around my Kindergarten year of school, around 1985, my parents bought a TRS-80 Color Computer 2 (aka CoCo II). It had 64K of RAM and used a TV for display and sound. This got you the computer. It didn't get you any disk drive or anything, no joysticks (required by a number of games). So whenever the system powered down, or it hung and you had to power cycle it (a frequent event), you'd lose whatever you were doing and would have to re-enter the program, literally by typing it in. The floppy drive for the CoCo II cost more than the computer, and it was quite common for people to buy the computer first and then the floppy drive later when they'd saved up the money for that.

I particularly want to mention that computers then didn't come with a modem. That would be like buying a laptop or a tablet without wifi today. A modem, which I'll talk about in a bit, was another expensive accessory. To cobble together a system in the 80s that was capable of talking to others, with persistent storage (floppy or hard drive), screen, keyboard, and modem, would be quite expensive. Adjusted for inflation, if you're talking a PC-style device (a clone of the IBM PC that ran DOS), this would easily be more expensive than the MacBook Pros of today. Few people back in the 80s had a computer at home. And the portion of those that had even the capability to get online in a meaningful way was even smaller.

Eventually my parents bought a PC clone with 640K RAM and dual floppy drives. This was primarily used for my mom's work, but I did my best to take it over whenever possible. It ran DOS and, despite its monochrome screen, was generally a more capable machine than the CoCo II. For instance, it supported lowercase. (I'm not even kidding; the CoCo II pretty much didn't.) A while later, they purchased a 32MB hard drive for it. What luxury!

Just getting a machine to work wasn't easy. Say you'd bought a PC, and then bought a hard drive and a modem. You didn't just plug in the hard drive and have it work. You would have to fight it every step of the way. The BIOS and DOS partition tables of the day used a cylinder/head/sector method of addressing the drive, and various parts of those addresses had too few bits to work with the big drives of the day, above 20MB. So you would have to lie to the BIOS and fdisk in various ways, and sort of work out how to do it for each drive. For each peripheral (serial port, sound card in later years, etc.), you'd have to set jumpers for DMA and IRQs, hoping not to conflict with anything already in the system. Perhaps you can now start to see why USB and PCI were so welcomed.
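As a rough illustration of why those cylinder/head/sector limits bit so early (the drive geometry below is a made-up example, not a specific model): capacity is just cylinders times heads times sectors times 512 bytes, and early DOS kept a partition's sector count in a 16-bit field, which is where the infamous 32MB ceiling came from.

```python
SECTOR_SIZE = 512  # bytes, standard for drives of that era

def chs_capacity(cylinders, heads, sectors_per_track):
    """Total bytes addressable with a given cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE

# Illustrative geometry for a small late-80s drive (not a real model number):
print(chs_capacity(615, 4, 26) / 1_000_000)   # ~32.7 MB

# Early DOS stored a partition's sector count in 16 bits, so one partition
# could hold at most:
print(2**16 * SECTOR_SIZE / 2**20)            # 32.0 MiB
```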

Sharing and finding resources

Despite the two computers in our home, it wasn't as if software written on one machine just ran on another. A lot of software for PC clones assumed a CGA color display. The monochrome HGC in our PC wasn't particularly compatible. You could find a TSR program to emulate the CGA on the HGC, but it wasn't particularly stable, and there's only so much you can do when a program that assumes color is shown on a monitor that can only display black, dark amber, or light amber. So I'd periodically get to use other computers, most commonly at an office in the evening when it wasn't being used. There were some local computer clubs that my dad took me to periodically. Software was swapped back then: disks copied, shareware exchanged, and so forth. For me, at least, there was no "online" to download software from, and selling software over the Internet wasn't a thing at all.

Three Different Worlds

There were sort of three different worlds of computing experience in the 80s:
  1. Home users. Initially using a wide variety of software from Apple, Commodore, Tandy/RadioShack, etc., but eventually coming to be mostly dominated by IBM PC clones
  2. Small and mid-sized business users. Some of them had larger minicomputers or small mainframes, but most that I had contact with by the early 90s were standardized on DOS-based PCs. More advanced ones had a network running Netware, most commonly. Networking hardware and software was generally too expensive for home users to use in the early days.
  3. Universities and large institutions. These are the places that had the mainframes, the earliest implementations of TCP/IP, the earliest users of UUCP, and so forth.
The difference between the home computing experience and the large institution experience was vast. Not only in terms of dollars (the large institution hardware could easily cost anywhere from tens of thousands to millions of dollars), but also in terms of sheer resources required (large rooms, enormous power circuits, support staff, etc). Nothing was in common between them: not operating systems, not software, not experience. I was never much aware of the third category until the differences started to collapse in the mid-90s, and even then I was only exposed to it once the collapse was well underway.

You might say to me, "Well, Google certainly isn't running what I'm running at home!" And, yes of course, it's different. But fundamentally, most large datacenters are running on x86_64 hardware, with Linux as the operating system, and a TCP/IP network. It's a different scale, obviously, but at a fundamental level, the hardware and operating system stack are pretty similar to what you can readily run at home. Back in the 80s and 90s, this wasn't the case. TCP/IP wasn't even available for DOS or Windows until much later, and when it was, it was a clunky beast that was difficult to use.

One of the things Kevin Driscoll highlights in his book Modem World (see my short post about it) is that the history of the Internet we usually receive is focused on case 3: the large institutions. In reality, the Internet was and is literally a network of networks. Gateways to and from the Internet existed from all three kinds of users for years, and while TCP/IP ultimately won the battle of the internetworking protocol, the other two streams of users also shaped the Internet as we now know it. Like many, I had no access to the large institution networks, but as I've been reflecting on my experiences, I've found a new appreciation for the way that those of us who grew up with primarily home PCs also shaped the evolution of today's online world.

An Era of Scarcity

I should take a moment to comment about the cost of software back then. A newspaper article from 1985 comments that WordPerfect, then the most powerful word processing program, sold for $495 (or $219 if you could score a mail order discount). That's $1360/$600 in 2022 money. Other popular software, such as Lotus 1-2-3, was up there as well. If you were to buy a new PC clone in the mid to late 80s, it would often cost $2000 in 1980s dollars. Now add a printer: a low-end dot matrix for $300, or a laser for $1500 or even more. A modem: another $300. So the basic system would be $3600, or $9900 in 2022 dollars. If you wanted a nice printer, you're now pushing well over $10,000 in 2022 dollars. You start to see one barrier here, and also why things like shareware and piracy (if it was indeed even recognized as such) were common in those days.

So you can see, going from a home computer setup (TRS-80, Commodore C64, Apple ][, etc) to a business-class PC setup was an order-of-magnitude increase in cost. From there to the high-end minis/mainframes was another order of magnitude (at least!). Eventually there was price pressure on the higher end and things all got better, which is probably why the non-DOS PCs lasted until the early 90s.
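For what it's worth, the inflation multiplier implied by those figures is internally consistent; a quick sketch (the roughly 2.75x factor is derived from the WordPerfect numbers above, not from an official CPI table):

```python
# Inflation multiplier implied by the WordPerfect figures above (1985 -> 2022)
factor = 1360 / 495                 # roughly 2.75x

basic_system_1980s = 3600           # the "basic system" total quoted above, 1980s dollars
print(round(basic_system_1980s * factor))   # ~9,890, i.e. the ~$9,900 figure in the text
```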

Increasing Capabilities

My first exposure to computers in school was in the 4th grade, when I would have been about 9. There was a single Apple ][ machine in that room. I primarily remember playing Oregon Trail on it. The next year, the school added a computer lab. Remember, this is a small rural area, so each graduating class might have about 25 people in it; this lab was shared by everyone in the K-8 building. It was full of some flavor of IBM PS/2 machines running DOS and Netware. There was a dedicated computer teacher too, though I think she was a regular teacher that was given somewhat minimal training on computers. We were going to learn typing that year, but I did so well on the very first typing program that we soon worked out that I could do programming instead. I started going to school early (these machines were far more powerful than the XT at home) and worked on programming projects there. Eventually my parents bought me a Gateway 486SX/25 with a VGA monitor and hard drive. Wow! This was a whole different world. It may have come with Windows 3.0 or 3.1 on it, but I mainly remember running OS/2 on that machine. More on that below.

Programming

That CoCo II came with a BASIC interpreter in ROM. It came with a large manual, which served as a BASIC tutorial as well. The BASIC interpreter was also the shell, so literally you could not use the computer without at least a bit of BASIC. Once I had access to a DOS machine, it also had a BASIC interpreter: GW-BASIC. There was a fair bit of software written in BASIC at the time, but most of the more advanced software wasn't. I wondered how these .EXE and .COM programs were written. I could find vague references to DEBUG.EXE, assemblers, and such. But it wasn't until I got a copy of Turbo Pascal that I was able to do that sort of thing myself. Eventually I got Borland C++ and taught myself C as well. A few years later, I wanted to try writing GUI programs for Windows, and bought Watcom C++ (much cheaper than the competition, and it could target Windows, DOS, and I think even OS/2). Notice that, aside from BASIC, none of this was free, and none of it was bundled. You couldn't just download a C compiler, or Python interpreter, or whatnot back then. You had to pay for the ability to write any kind of serious code on the computer you already owned.

The Microsoft Domination

Microsoft came to dominate the PC landscape, and then even the computing landscape as a whole. IBM very quickly lost control over the hardware side of PCs as Compaq and others made clones, but Microsoft has managed, in varying degrees even to this day, to keep a stranglehold on the software, and especially the operating system, side. Yes, there was occasional talk of things like DR-DOS, but by and large the dominant platform came to be the PC, and if you had a PC, you ran DOS (and later Windows) from Microsoft. For a while, it looked like IBM was going to challenge Microsoft on the operating system front; they had OS/2, and when I switched to it sometime around the version 2.1 era in 1993, it was unquestionably more advanced technically than the consumer-grade Windows from Microsoft at the time. It had Internet support baked in, could run most DOS and Windows programs, and had introduced a replacement for the by-then terrible FAT filesystem: HPFS, in 1988. Microsoft wouldn't introduce a better filesystem for its consumer operating systems until Windows XP in 2001, 13 years later. But more on that story later.

Free Software, Shareware, and Commercial Software

I've covered the high cost of software already. Obviously $500 software wasn't going to sell in the home market. So what did we have? Mainly, these things:
  1. Public domain software. It was free to use, and if implemented in BASIC, probably had source code with it too.
  2. Shareware
  3. Commercial software (some of it from small publishers was a lot cheaper than $500)
Let's talk about shareware. The idea with shareware was that a company would release a useful program, sometimes limited. You were encouraged to "register", or pay for, it if you liked it and used it. And, regardless of whether you registered it or not, you were told: please copy! Sometimes shareware was fully functional, and registering it got you nothing more than printed manuals and an easy conscience (guilt trips for not registering weren't necessarily very subtle). Sometimes unregistered shareware would have a nag screen: a delay of a few seconds while they told you to register. Sometimes they'd be limited in some way; you'd get more features if you registered. With games, it was popular to have a trilogy, and release the first episode (inevitably ending with a cliffhanger) as shareware, while the subsequent episodes would require registration. In any event, a lot of software people used in the 80s and 90s was shareware. Also pirated commercial software, though in the earlier days of computing, I think some people didn't even know the difference.

Notice what's missing: Free Software / FLOSS in the Richard Stallman sense of the word. Stallman lived in the big institution world (after all, he worked at MIT), and what he was doing with the Free Software Foundation and GNU project beginning in 1983 never really filtered into the DOS/Windows world at the time. I had no awareness of it even existing until into the 90s, when I first started getting some hints of it as a port of gcc became available for OS/2. The Internet was what really brought this home, but I'm getting ahead of myself. I want to say again: FLOSS never really entered the DOS and Windows 3.x ecosystems. You'd see it make a few inroads here and there in later versions of Windows, and more so now that Microsoft has been sort of forced to accept it, but still, reflect on its legacy. What is the software market like in Windows compared to Linux, even today? Now it is, finally, time to talk about connectivity!

Getting On-Line

What does it even mean to get on-line? Certainly not connecting to a wifi access point. The answer is, unsurprisingly, complex. But for everyone except the large institutional users, it begins with a telephone.

The telephone system

By the 80s, there was one communication network that already reached into nearly every home in America: the phone system. Virtually every household (note I don't say every person) was uniquely identified by a 10-digit phone number. You could, at least in theory, call up virtually any other phone in the country and be connected in less than a minute. But I've got to talk about cost. The way things worked in the USA, you paid a monthly fee for a phone line. Included in that monthly fee was unlimited local calling. What is a local call? That was an extremely complex question. Generally it meant, roughly, calling within your city. But of course, as you deal with things like suburbs and cities growing into each other (eg, the Dallas-Ft. Worth metroplex), things got complicated fast. But let's just say for simplicity you could call others in your city. What about calling people not in your city? That was "long distance", and you paid, often hugely, by the minute for it. Long distance rates were difficult to figure out, but were generally most expensive during business hours and cheapest at night or on weekends. Prices eventually started to come down when competition was introduced for long distance carriers, but even then you often were stuck with a single carrier for long distance calls outside your city but within your state. Anyhow, let's just leave it at this: local calls were virtually free, and long distance calls were extremely expensive.

Getting a modem

I remember getting a modem that ran at either 1200bps or 2400bps. Either way, quite slow; you could often read even plain text faster than the modem could display it. But what was a modem? A modem hooked up to a computer with a serial cable, and to the phone system. By the time I got one, modems could automatically dial and answer. You would send a command like ATDT5551212 and it would dial 555-1212. Modems had speakers, because often things wouldn't work right, and the telephone system was oriented around speech, so you could hear what was happening. You'd hear it wait for a dial tone, then dial; then hopefully the remote end would ring, a modem there would answer, you'd hear the screeching of a handshake, and eventually your terminal would say CONNECT 2400. Now your computer was bridged to the other; anything going out your serial port was encoded as sound by your modem and decoded at the other end, and vice-versa. But what, exactly, was the other end? It might have been another person at their computer. Turn on local echo, and you could see what they typed. Maybe you'd send files to each other. But in my case, the answer was different: PC Magazine.
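A minimal sketch of what that Hayes-style AT dialing sequence looks like from software, using the third-party pyserial package (the device path, speed, and phone number are placeholders, not anything from the original setup):

```python
import serial  # third-party "pyserial" package; everything below is illustrative

# Open the serial port the modem is attached to, at the modem's speed.
port = serial.Serial("/dev/ttyS0", 2400, timeout=2)

port.write(b"ATZ\r")              # reset the modem
print(port.read(64))              # expect something like b"\r\nOK\r\n"

port.timeout = 45                 # dialing and handshaking take a while
port.write(b"ATDT5551212\r")      # ATtention, Dial using Tone, then the number
print(port.read(64))              # hopefully b"\r\nCONNECT 2400\r\n" after the screech
```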

PC Magazine and CompuServe

Starting around 1986 (so I would have been about 6 years old), I got to read PC Magazine. My dad would bring home copies that were being discarded at his office for me to read, and I think eventually bought me a subscription directly. This was not just a standard magazine; it ran something like 350-400 pages an issue, and came out every other week. This thing was a monster. It had reviews of hardware and software, descriptions of upcoming technologies, pages and pages of ads (that often had some degree of informative value to them). And they had sections on programming. Many issues would talk about BASIC or Pascal programming, and there'd be a utility in most issues. What do I mean by "a utility in most issues"? Did they include a floppy disk with software? No, of course not. There was a literal program listing printed in the magazine. If you wanted the utility, you had to type it in. And a lot of them were written in assembler, so you had to have an assembler. An assembler, of course, was not free, and I didn't have one. Or maybe they wrote it in Microsoft C, and I had Borland C, and (of course) they weren't compatible. Sometimes they would list the program sort of in binary: line after line of a BASIC program, with lines like 64, 193, 253, 0, 53, 0, 87 that you would type in for hours, hopefully correctly. Running the BASIC program would, if you got it correct, emit a .COM file that you could then run. They did have a rudimentary checksum system built in, but it wasn't even a CRC, so something like swapping two numbers you'd never notice, except when the program would mysteriously hang.

Eventually they teamed up with CompuServe to offer a limited slice of CompuServe for the purpose of downloading PC Magazine utilities. This was called PC MagNet. I am foggy on the details, but I believe that for a time you could connect to the limited PC MagNet part of CompuServe for free (after the cost of the long-distance call, that is) rather than paying for CompuServe itself (because, OF COURSE, that also charged you by the minute). So in the early days, I would get special permission from my parents to place a long distance call, and after some nerve-wracking minutes in which we were aware every minute was racking up charges, I could navigate the menus, download what I wanted, and log off immediately.

I still, incidentally, mourn what PC Magazine became. As with computing generally, it followed the mass market. It lost its deep technical chops, cut its programming columns, stopped talking about things like how SCSI worked, and so forth. By the time it stopped printing in 2009, it was no longer a square-bound 400-page behemoth, but rather looked more like a copy of Newsweek, but with less depth.
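To make that checksum weakness concrete: a plain additive checksum is order-insensitive, so transposing two of those typed-in byte values goes undetected, while even a simple CRC catches it. A small sketch (illustrative only; this is not PC Magazine's actual scheme):

```python
import zlib

correct    = [64, 193, 253, 0, 53, 0, 87]   # byte values as printed in the listing
transposed = [64, 253, 193, 0, 53, 0, 87]   # two adjacent values swapped while typing

# An additive checksum cannot tell the two apart:
print(sum(correct) % 256, sum(transposed) % 256)                   # same value twice

# A CRC over the same bytes does detect the transposition:
print(zlib.crc32(bytes(correct)), zlib.crc32(bytes(transposed)))   # two different values
```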

Continuing with CompuServe

CompuServe was a much larger service than just PC MagNet. Eventually, our family got a subscription. It was still an expensive and scarce resource; I'd call it only after hours when the long-distance rates were cheapest. Everyone had a numerical username separated by a comma; mine was 71510,1421. CompuServe had forums, and files. Eventually I would use TapCIS to queue up things I wanted to do offline, to minimize phone usage online. CompuServe eventually added a gateway to the Internet. For the sum of somewhere around $1 a message, you could send or receive an email from someone with an Internet email address! I remember the thrill of one time, as a kid of probably 11 years, sending a message to one of the editors of PC Magazine and getting a kind, if brief, reply back! But inevitably I had...

The Godzilla Phone Bill

Yes, one month I became lax in tracking my time online. I ran up my parents' phone bill. I don't remember how high, but I remember it was hundreds of dollars, a hefty sum at the time. As I watched Jason Scott's BBS Documentary, I realized how common an experience this was. I think this was the end of CompuServe for me for a while.

Toll-Free Numbers

I lived near a town with a population of 500. Not even IN town, but near town. The calling area included another town with a population of maybe 1500, so all told, there were maybe 2000 people total I could talk to with a local call, though far fewer numbers, because remember, telephones were allocated by the household. There were, as far as I know, zero modems that were a local call (aside from one that belonged to a friend I met in around 1992). So basically everything was long-distance. But there was a special feature of the telephone network: toll-free numbers. Normally when calling long-distance, you, the caller, paid the bill. But with a toll-free number, beginning with 1-800, the recipient paid the bill. These numbers almost inevitably belonged to corporations that wanted to make it easy for people to call. Sales and ordering lines, for instance. Some of these companies started to set up modems on toll-free numbers. There were few of these, but they existed, so of course I had to try them! One of them was a company called PennyWise that sold office supplies. They had a toll-free line you could call with a modem to order stuff. Yes, online ordering before the web! I loved office supplies. And, because I lived far from a big city, if the local K-Mart didn't have it, I probably couldn't get it. Of course, the interface was entirely text, but you could search for products and place orders with the modem. I had loads of fun exploring the system, and actually ordered things from them and probably actually saved money doing so. With the first order they shipped a monster full-color catalog. That thing must have been 500 pages, like the Sears catalogs of the day. Every item had a part number, which streamlined ordering through the modem.

Inbound FAXes

By the 90s, a number of modems became able to send and receive FAXes as well. For those that don't know, a FAX machine was essentially a special modem. It would scan a page and digitally transmit it over the phone system, where it would (at least in the early days) be printed out in real time, because the machines didn't have the memory to store an entire page as an image. Eventually, PC modems integrated FAX capabilities. There still wasn't anything useful I could do locally, but there were ways I could get other companies to FAX something to me. I remember two of them. One was for US Robotics. They had an on-demand FAX system. You'd call up a toll-free number, which was an automated IVR system. You could navigate through it and select various documents of interest to you: spec sheets and the like. You'd key in your FAX number, hang up, and US Robotics would call YOU and FAX you the documents you wanted. Yes! I was talking to a computer (of a sort) at no cost to me! The New York Times also ran a service for a while called TimesFax. Every day, they would FAX out a page or two of summaries of the day's top stories. This was pretty cool in an era in which I had no other way to access anything from the New York Times. I managed to sign up for TimesFax (I have no idea how, anymore), and for a while I would get a daily FAX of their top stories. When my family got its first laser printer, I could then even print these FAXes complete with the gothic New York Times masthead. Wow! (OK, so technically I could print them on a dot-matrix printer also, but graphics on a 9-pin dot matrix is a kind of pain that is a whole other article.)

My own phone line

Remember how I discussed that phone lines were allocated per household? This was a problem for a lot of reasons:
  1. Anybody that tried to call my family while I was using my modem would get a busy signal (unable to complete the call)
  2. If anybody in the house picked up the phone while I was using the modem, that would degrade the quality of the ongoing call and either garble or disconnect it. In many cases, that could cancel a file transfer (which wasn't necessarily easy or possible to resume), prompting howls of annoyance from me.
  3. Generally we all had to work around each other
So eventually I found various small jobs and used the money I made to pay for my own phone line and my own long distance costs. Eventually I even upgraded to a 28.8Kbps US Robotics Courier modem! Yes, you heard it right: I got a job and a bank account so I could have a phone line and a faster modem. Uh, isn't that why every teenager gets a job? Now my local friend and I could call each other freely, at least on my end (I can't remember if he had his own phone line too). We could exchange files using HS/Link, which had the added benefit of allowing split-screen chat even while a file transfer was in progress. I'm sure we spent hours chatting to each other keyboard-to-keyboard while sharing files with each other.

Technology in Schools

By this point in the story, we're in the late 80s and early 90s. I'm still using PC-style OSs at home; OS/2 in the later years of this period, DOS or maybe a bit of Windows in the earlier years. I mentioned that they let me work on programming at school starting in 5th grade. It was soon apparent that I knew more about computers than anybody on staff, and I started getting pulled out of class to help teachers or administrators with vexing school problems. This continued until I graduated from high school, incidentally often to my enjoyment, and the annoyance of one particular teacher who, I must say, I was fine with annoying in this way.

That's not to say that there was institutional support for what I was doing. It was, after all, a small school. Larger schools might have introduced BASIC or maybe Logo in high school. But I had already taught myself BASIC, Pascal, and C by the time I was somewhere around 12 years old, so I wouldn't have had any use for that anyhow. There were programming contests occasionally held in the area. Schools would send teams. My school didn't really send anybody, but I went as an individual. One of them was run by a local college (but for jr. high and high school students). Years later, I met one of the professors that ran it. He remembered me, and that day, better than I did. The programming contest had problems one could solve in BASIC or Logo. I knew nothing about what to expect going into it, but I had lugged my computer and screen along, and asked him, "Can I write my solutions in C?" He was, apparently, stunned, but said sure, go for it. I took first place that day, leading to some rather confused teams from much larger schools.

The Netware network that the school had was, as these generally were, itself isolated. There was no link to the Internet or anything like it. Several schools across three local counties eventually invested in a fiber-optic network linking them together. This built a larger, but still closed, network. Its primary purpose was to allow students to be exposed to a wider variety of classes at high schools. Participating schools had an "ITV room", outfitted with cameras and mics. So students at any school could take classes offered over ITV at other schools. For instance, only my school taught German classes, so people at any of those participating schools could take German. It was an early Zoom room. But alongside the TV signal, there was enough bandwidth to run some Netware frames. By about 1995 or so, this let one of the schools purchase some CD-ROM software that was made available on a file server and could be accessed by any participating school. Nice! But Netware was mainly about file and printer sharing; there wasn't even a facility like email, at least not on our deployment.

BBSs

My last hop before the Internet was the BBS. A BBS was a computer program, usually run by a hobbyist like me, on a computer with a modem connected. Callers would call it up, and they'd interact with the BBS. Most BBSs had discussion groups like forums and file areas. Some also had games. I, of course, continued to have that most vexing of problems: they were all long-distance. There were some ways to help with that, chiefly QWK and BlueWave. These, somewhat like TapCIS in the CompuServe days, let me download new message posts for reading offline, and queue up my own messages to send later. QWK and BlueWave didn't help with file downloading, though.

BBSs get networked

BBSs were an interesting thing. You'd call up one, and inevitably somewhere in the file area would be a BBS list. Download the BBS list and you've suddenly got a list of phone numbers to try calling. All of them were long distance, of course. You'd try calling them at random and have a success rate of maybe 20%. The other 80% would be defunct; you might get the dreaded "this number is no longer in service" or the even more dreaded angry human answering the phone (and of course a modem can't talk to a human, so they'd just get silence for probably the nth time that week). The phone company cared nothing about BBSs and recycled their numbers just as fast as any others. To talk to various people, or participate in certain discussion groups, you'd have to call specific BBSs. That's annoying enough in the general case, but even more so for someone paying long distance for it all, because it takes a few minutes to establish a connection to a BBS: handshaking, logging in, menu navigation, etc. But BBSs started talking to each other. The earliest successful such effort was FidoNet, and for the duration of the BBS era, it remained by far the largest. FidoNet was analogous to the UUCP that the institutional users had, but ran on the much cheaper PC hardware. Basically, BBSs that participated in FidoNet would relay email, forum posts, and files between themselves overnight. Eventually, as with UUCP, by hopping through this network, messages could reach around the globe, and forums could have worldwide participation asynchronously, long before they could link to each other directly via the Internet. It was almost entirely volunteer-run.
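The core idea behind both FidoNet and UUCP is store-and-forward: messages written during the day only accumulate in a local spool, and get exchanged in one cheap overnight batch. A minimal sketch of that pattern (the node address and hub callback are made up for illustration; this is not FidoNet's actual packet format):

```python
from collections import deque

outbound = deque()   # the local "spool": messages queued during the day

def post_message(to_node, body):
    """Queue a message for later delivery instead of sending it immediately."""
    outbound.append({"to": to_node, "body": body})

def nightly_exchange(dial_hub):
    """During the cheap-rate window, push the whole batch in one call."""
    batch = list(outbound)
    outbound.clear()
    dial_hub(batch)   # one long-distance call for everything, not one per message;
                      # the hub then forwards each item hop by hop toward its node

post_message("1:234/56", "Hello from rural Kansas!")
nightly_exchange(lambda batch: print(f"relaying {len(batch)} message(s) overnight"))
```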

Running my own BBS

At age 13, I eventually chose to set up my own BBS. It ran on my single phone line, so of course when I was dialing up something else, nobody could dial up me. Not that this was a huge problem; in my town of 500, I probably had a good 1 or 2 regular callers in the beginning. In the PC era, there was a big difference between a server and a client. Server-class software was expensive and rare. Maybe in later years you had an email client, but an email server would be completely unavailable to you as a home user. But with a BBS, I could effectively run a server. I even ran serial lines in our house so that the BBS could be connected to from other rooms! Since I was running OS/2, the BBS didn't tie up the computer; I could continue using it for other things. FidoNet had an Internet email gateway. This one, unlike CompuServe's, was free. Once I had a BBS on FidoNet, you could reach me from the Internet using the FidoNet address. This didn't support attachments, but then email of the day didn't really, either. Various others outside Kansas ran FidoNet distribution points. I believe one of them was mgmtsys; my memory is quite vague, but I think they offered a direct gateway and I would call them to pick up Internet mail via FidoNet protocols, but I'm not at all certain of this.
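For readers who never saw it, the classic FidoNet-to-Internet gateway encoded a node address like 1:234/56.7 into a hostname under fidonet.org. A sketch of that mapping (the name and node number below are invented examples, not the author's actual address):

```python
def fidonet_to_internet(user, zone, net, node, point=0):
    """Map a FidoNet address (zone:net/node.point) to the classic gateway form."""
    host = f"f{node}.n{net}.z{zone}.fidonet.org"
    if point:
        host = f"p{point}.{host}"
    return f"{user.replace(' ', '.')}@{host}"

# A made-up sysop at made-up node 1:234/56:
print(fidonet_to_internet("Jane Sysop", 1, 234, 56))
# -> Jane.Sysop@f56.n234.z1.fidonet.org
```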

Pros and Cons of the Non-Microsoft World

As mentioned, Microsoft was and is the dominant operating system vendor for PCs. But I left that world in 1993, and here, nearly 30 years later, have never really returned. I got an operating system with more technical capabilities than the DOS and Windows of the day, but the tradeoff was a much smaller software ecosystem. OS/2 could run DOS programs, but it ran OS/2 programs a lot better. So if I were to run a BBS, I wanted one that had a native OS/2 version, which limited me to a small fraction of the available BBS server software. On the other hand, as a fully 32-bit operating system, there started to be OS/2 ports of certain software with a Unix heritage; most notably for me at the time, gcc. At some point, I eventually came across the RMS essays and started to be hooked.

Internet: The Hunt Begins

I certainly was aware that the Internet was out there and interesting. But the first problem was: how the heck do I get connected to the Internet?

Computer labs

There was one place that tended to have Internet access: colleges and universities. In 7th grade, I participated in a program that resulted in me being invited to visit Duke University, and in 8th grade, I participated in National History Day, resulting in a trip to visit the University of Maryland. I probably sought out computer labs at both of those. My most distinct memory was finding my way into a computer lab at one of those universities, and it was full of NeXT workstations. I had never seen or used NeXT before, and had no idea how to operate it. I had brought a box of floppy disks, unaware that the DOS disks probably weren't compatible with NeXT. Closer to home, a small college had a computer lab that I could also visit. I would go there, with my stack of floppies, in summer or when it wasn't otherwise being used. I remember downloading disk images of FLOSS operating systems: FreeBSD, Slackware, or Debian, at the time. The hash marks from the DOS-based FTP client would creep across the screen as the 1.44MB disk images slowly downloaded. telnet was also available on those machines, so I could telnet to things like public-access Archie servers and libraries (though not Gopher). Still, FTP and telnet access opened up a lot, and I learned quite a bit in those years.
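For the curious, that ritual of pulling floppy images over FTP maps almost one-to-one onto Python's standard ftplib today. A small sketch (the host and path are placeholders; the printed '#' marks mimic the old DOS client's progress hashes):

```python
import sys
from ftplib import FTP

# Placeholder mirror and path: any anonymous FTP server with disk images works the same.
ftp = FTP("ftp.example.org")
ftp.login()                                 # anonymous login, like the public mirrors

with open("disk01.img", "wb") as out:
    def write_block(data):
        out.write(data)
        sys.stdout.write("#")               # one hash mark per block received
        sys.stdout.flush()
    ftp.retrbinary("RETR pub/images/disk01.img", write_block)

ftp.quit()
```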

Continuing the Journey

At some point, I got a copy of the Whole Internet User's Guide and Catalog, published in 1994. I still have it. If I hadn't already figured it out by then, I certainly became aware from it that Unix was the dominant operating system on the Internet. The examples in Whole Internet covered FTP, telnet, and gopher, all assuming the user somehow got to a Unix prompt. The web was introduced about 300 pages in, clearly viewed as something that wasn't page 1 material. And it covered the command-line www client before introducing the graphical Mosaic. Even then, though, the book highlighted Mosaic's utility as a front-end for Gopher and FTP, and even the ability to launch telnet sessions by clicking on links. But having a copy of the book didn't equate to having any way to run Mosaic. The machines in the computer lab I mentioned above all ran DOS and were incapable of running a graphical browser. I had no SLIP or PPP (both ways to run Internet traffic over a modem) connectivity at home. In short, the Web was something for the large institutional users at the time.

CD-ROMs

As CD-ROMs came out, with their huge (for the day) 650MB capacity, various companies started collecting software that could be downloaded on the Internet and selling it on CD-ROM. The two most popular ones were Walnut Creek CD-ROM and Infomagic. One could buy extensive shareware and gaming collections, and then even entire Linux and BSD distributions. Although not exactly an Internet service per se, it was a way of bringing what might ordinarily be accessible only to institutional users into the home computer realm.

Free Software Jumps In

As I mentioned, by the mid 90s, I had come across RMS's writings about free software, most probably his 1992 essay "Why Software Should Be Free". (Please note, this is not a commentary on the more recently-revealed issues surrounding RMS, but rather his writings and work as I encountered them in the 90s.) The notion of a Free operating system, not just in cost but in openness, was incredibly appealing. Not only could I tinker with it to a much greater extent due to having source for everything, but it included so much software that I'd otherwise have to pay for. Compilers! Interpreters! Editors! Terminal emulators! And, especially, server software of all sorts. There'd be no way I could afford or run Netware, but with a Free Unixy operating system, I could do all that. My interest was obviously piqued. Add to that the fact that I could actually participate and contribute, and I was about to become hooked on something that I've stayed hooked on for decades. But then the question was: which Free operating system? Eventually I chose FreeBSD to begin with; that would have been sometime in 1995. I don't recall the exact reasons for that. I remember downloading Slackware install floppies, and probably the fact that Debian wasn't yet at 1.0 scared me off for a time. FreeBSD's fantastic Handbook (far better than anything I could find for Linux at the time) was no doubt also a factor.

The de Raadt Factor

Why not NetBSD or OpenBSD? The short answer is Theo de Raadt. Somewhere in this time, when I was somewhere between 14 and 16 years old, I asked some questions comparing NetBSD to the other two free BSDs. This was on a NetBSD mailing list, but for some reason Theo saw it and got a flame war going, which CC'd me. Now keep in mind that even if NetBSD had a web presence at the time, it would have been minimal, and I would have had (not all that unusually for the time) no way to access it. I was certainly not aware of the, shall we say, acrimony between Theo and NetBSD. While I had certainly seen an online flamewar before, this took on a different and more disturbing tone; months later, Theo randomly emailed me under the subject "SLIME", saying that I was, well, "SLIME". I seem to recall periodic emails from him thereafter reminding me that he hates me and that he had blocked me. (Disclaimer: I have poor email archives from this period, so the full details are lost to me, but I believe I am accurately conveying these events from over 25 years ago.) This was a surprise, and an unpleasant one. I was trying to learn, and while it is possible I didn't understand some aspect or other of netiquette (or Theo's personal hatred of NetBSD) at the time, still that is not a reason to flame a 16-year-old (though he would have had no way to know my age). This didn't leave any kind of scar, but did leave a lasting impression; to this day, I am particularly concerned with how FLOSS projects handle poisonous people. Debian, for instance, has come a long way in this over the years, and even Linus Torvalds has turned over a new leaf. I don't know if Theo has. In any case, I didn't use NetBSD then. I did try it periodically in the years since, but never found it compelling enough to justify a large switch from Debian. I never tried OpenBSD for various reasons, but one of them was that I didn't want to join a community that tolerates behavior such as Theo's from its leader.

Moving to FreeBSD

Moving from OS/2 to FreeBSD was final. That is, I didn't have enough hard drive space to keep both. I also didn't have the backup capacity to back up OS/2 completely. My BBS, which ran Virtual BBS (and at some point also AdeptXBBS), was deleted and reincarnated in a different form. My BBS was a member of both FidoNet and VirtualNet; the latter was specific to VBBS, and had to be dropped. I believe I may have also had to drop the FidoNet link for a time. This was the biggest change of computing in my life to that point. The earlier experiences hadn't literally destroyed what came before. OS/2 could still run my DOS programs. Its command shell was quite DOS-like. It ran Windows programs. I was going to throw all that away and leap into the unknown. I wish I had saved a copy of my BBS; I would love to see the messages I exchanged back then, or see its menu screens again. I have little memory of what it looked like. But other than that, I have no regrets. Pursuing Free, Unixy operating systems brought me a lot of enjoyment and a good career.

That's not to say it was easy. All the problems of not being in the Microsoft ecosystem were magnified under FreeBSD and Linux. In a day before EDID, monitor timings had to be calculated manually, and you risked destroying your monitor if you got them wrong. Word processing and spreadsheet software was pretty much not there for FreeBSD or Linux at the time; I was therefore forced to learn LaTeX and actually appreciated that. Software like PageMaker or CorelDraw was certainly nowhere to be found for those free operating systems either. But I got a ton of new capabilities. I mentioned the BBS didn't shut down, and indeed it didn't. I ran what was surely a supremely unique oddity: a free, dial-in Unix shell server in the middle of a small town in Kansas. I'm sure I provided things such as pine for email and some help text and maybe even printouts for how to use it. The set of callers slowly grew over the time period, in fact. And then I got UUCP.

Enter UUCP

Even throughout all this, there was no local Internet provider and things were still long distance. I had Internet email access via assorted routes, but they were all strange. And, I wanted access to Usenet. In 1995, it happened. The local ISP I mentioned offered UUCP access. Though I couldn't afford the dialup shell (or later, SLIP/PPP) that they offered due to long-distance costs, UUCP's very efficient batched processes looked doable. I believe I established that link when I was 15, so in 1995. I worked to register my domain, complete.org, as well. At the time, the process was a bit lengthy and involved downloading a text file form, filling it out in a precise way, sending it to InterNIC, and probably mailing them a check. Well, I did that, and in September of 1995, complete.org became mine. I set up sendmail on my local system, as well as INN to handle the limited Usenet newsfeed I requested from the ISP. I even ran Majordomo to host some mailing lists, including some that were surprisingly high-traffic for a few-times-a-day, long-distance, modem UUCP link! The modem client programs for FreeBSD were somewhat less advanced than for OS/2, but I believe I wound up using Minicom or Seyon to continue to dial out to BBSs and, I believe, continue to use Learning Link. So all the while I was setting up my local BBS, I continued to have access to the text Internet, consisting chiefly of Gopher for me.

Switching to Debian

I switched to Debian sometime in 1995 or 1996, and have been using Debian as my primary OS ever since. I continued to offer shell access, but added the WorldVU Atlantis menuing BBS system. This provided a return to a more BBS-like interface (by default; shell was still an option) as well as some BBS door games such as LoRD and TradeWars 2002, running under DOS emulation. I also continued to run INN, and ran ifgate to allow FidoNet echomail to be presented in INN as Usenet-like newsgroups, and netmail to be gated to Unix email. This worked pretty well. The BBS continued to grow in these days, peaking at about two dozen total user accounts, and maybe a dozen regular users.

Dial-up access availability

I believe it was in 1996 that dial-up PPP access finally became available in my small town. What a thrill! FINALLY! I could now FTP, use Gopher, telnet, and the web, all from home. Of course, it was at modem speeds, but still. (Strangely, I have a memory of accessing the Web using WebExplorer from OS/2. I don't know exactly why; it's possible that by this time, I had upgraded to a 486 DX2/66 and was able to reinstall OS/2 on the old 25MHz 486, or maybe something was wrong with the timeline from my memories from 25 years ago above. Or perhaps I made the occasional long-distance call somewhere before I ditched OS/2.) Gopher sites still existed at this point, and I could access them using Netscape Navigator, which likely became my standard Gopher client at that point. I don't recall using the UMN text-mode gopher client locally at that time, though it's certainly possible I did.

The city

Starting when I was 15, I took computer science classes at Wichita State University. The first one was a class in the summer of 1995 on C++. I remember being worried about being good enough for it (I was, after all, just out of my HS freshman year and had never taken the prerequisite C class). I loved it and got an A! By 1996, I was taking more classes. In 1996 or 1997 I stayed in Wichita during the day due to having more than one class. So, what would I do then but enjoy the computer lab? The CS dept. had two of them: one that had NCD X terminals connected to a pair of SunOS servers, and another one running Windows. I spent most of the time in the Unix lab with the NCDs; I'd use Netscape or pine, write code, enjoy the University's fast Internet connection, and so forth. In 1997 I graduated from high school, and that summer I moved to Wichita to attend college. As was so often the case, I shut down the BBS at that time. It would be 5 years until I again dealt with Internet at home in a rural community. By the time I moved to my apartment in Wichita, I had stopped using OS/2 entirely. I have no memory of ever having OS/2 there. Along the way, I had bought a Pentium 166, and then the most expensive piece of computing equipment I have ever owned: a DEC Alpha, which, of course, ran Linux.

ISDN

I must have used dial-up PPP for a time, but I eventually got a job working for the ISP I had used for UUCP, and then PPP. While there, I got a 128Kbps ISDN line installed in my apartment, and they gave me a discount on the service for it. That was around 3x the speed of a modem, and crucially was always on and gave me a public IP. No longer did I have to use UUCP; now I got to host my own things! By at least 1998, I was running a web server on www.complete.org, and I had an FTP server going as well.

Even Bigger Cities

In 1999 I moved to Dallas, and there got my first broadband connection: an ADSL link at, I think, 1.5Mbps! Now that was something! But it had some reliability problems. I eventually put together a server and had it hosted at an acquaintance's place who had SDSL in his apartment. Within a couple of years, I had switched to various kinds of proper hosting for it, but that is a whole other article. In Indianapolis, I got a cable modem for the first time, with higher speeds but prohibitions on running servers on it. Yuck.

Challenges

Being non-Microsoft continued to have challenges. Until the advent of Firefox, a web browser was one of the biggest. While Netscape supported Linux on i386, it didn't support Linux on Alpha. I hobbled along with various attempts at emulators, old versions of Mosaic, and so forth. And, until StarOffice was open-sourced as OpenOffice, reading Microsoft file formats was also a challenge, though WordPerfect was briefly available for Linux. Over the years, I have become used to the Linux ecosystem. Perhaps I use Gimp instead of Photoshop and digiKam instead of, well, whatever somebody would use on Windows. But I get ZFS, and containers, and so much that isn't available there. Yes, I know Apple never went away and is a thing, but for most of the time period I discuss in this article, at least after the rise of DOS, it was niche compared to the PC market.

Back to Kansas

In 2002, I moved back to Kansas, to a rural home near a different small town in the county next to where I grew up. Over there, it was back to dial-up at home, but I had faster access at work. I didn't much care for this, and thus began a 20+-year effort to get broadband in the country. At first, I got a wireless link, which worked well enough in the winter, but had serious problems in the summer when the trees leafed out. Eventually DSL became available locally: highly unreliable, but still, it was something. Then I moved back to the community I grew up in, a few miles from my childhood home. Again I got DSL, a bit better this time. But after some years, being at the end of the run of DSL meant I had poor speeds and reliability problems. I eventually switched to various wireless ISPs, which continues to the present day; while people in cities can get Gbps service, I can get, at best, about 50Mbps. Long-distance fees are gone, but the speed disparity remains.

Concluding Reflections

I am glad I grew up where I did; the strong community has a lot of advantages I don't have room to discuss here. In a number of very real senses, having no local services made things a lot more difficult than they otherwise would have been. However, perhaps I could say that I also learned a lot through the need to come up with inventive solutions to those challenges. To this day, I think a lot about computing in remote environments: partially because I live in one, and partially because I enjoy visiting places that are remote enough that they have no Internet, phone, or cell service whatsoever. I have written articles like Tools for Communicating Offline and in Difficult Circumstances based on my own personal experience. I instinctively think about making protocols robust in the face of various kinds of connectivity failures because I experience various kinds of connectivity failures myself.

(Almost) Everything Lives On

In 2002, Gopher turned 10 years old. It had probably been about 9 or 10 years since I had first used Gopher, which was the first way I got on the live Internet from my house. It was hard to believe. By that point, I had an always-on Internet link at home and at work. I had my Alpha, and probably also at least PCMCIA Ethernet for a laptop (many laptops had modems by the 90s also). Despite its popularity in the early 90s, less than 10 years after it came on the scene and started to unify the Internet, Gopher was mostly forgotten. And it was at that moment that I decided to try to resurrect it. The University of Minnesota finally released it under an Open Source license. I wrote the first new gopher server in years, pygopherd, and introduced gopher to Debian. Gopher lives on; there are now quite a few Gopher clients and servers out there, newly started post-2002. The Gemini protocol can be thought of as something akin to Gopher 2.0, and it too has a small but blossoming ecosystem. Archie, the old FTP search tool, is dead though. Same for WAIS and a number of the other pre-web search tools. But still, even FTP lives on today. And BBSs? Well, they didn't go away either. Jason Scott's fabulous BBS documentary looks back at the history of the BBS, while Back to the BBS from last year talks about the modern BBS scene. FidoNet somehow is still alive and kicking. UUCP still has its place and has inspired a whole string of successors. Some, like NNCP, are clearly direct descendants of UUCP. Filespooler lives in that ecosystem, and you can even see UUCP concepts in projects as far afield as Syncthing and Meshtastic. Usenet still exists, and you can now run Usenet over NNCP just as I ran Usenet over UUCP back in the day (which you can still do as well). Telnet, of course, has been largely supplanted by ssh, but the concept is more popular now than ever, as Linux has made ssh available on everything from the Raspberry Pi to Android. And I still run a Gopher server, looking pretty much like it did in 2002. This post also has a permanent home on my website, where it may be periodically updated.
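For anyone who has never seen it, part of Gopher's charm is how little protocol there is: connect to port 70, send a selector line, read until the server closes the connection. A minimal sketch (gopher.example.org is a placeholder, not a reference to any particular server):

```python
import socket

def gopher_fetch(host, selector="", port=70):
    """Fetch one Gopher menu or document: send the selector, then read to EOF."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(selector.encode() + b"\r\n")    # empty selector asks for the root menu
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:                             # server closes when it's done
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")

# Placeholder host; any public Gopher server answers the same way.
print(gopher_fetch("gopher.example.org"))
```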

26 August 2022

Antoine Beaupré: How to nationalize the internet in Canada

Rogers had a catastrophic failure in July 2022. It affected emergency services (as in: people couldn't call 911, but also some 911 services themselves failed), hospitals (which couldn't access prescriptions), banks and payment systems (as payment terminals stopped working), and regular users as well. The outage lasted almost a full day, and Rogers took days to give any technical explanation of the outage, and even when they did, details were sparse. So far the only detailed account is from outside actors like Cloudflare, which seem to point at an internal BGP failure. Its impact on the economy has yet to be measured, but it probably cost millions of dollars in wasted time and possibly led to life-threatening situations. Apart from holding Rogers (criminally?) responsible for this, what should be done in the future to avoid such problems? It's not the first time something like this has happened: it happened to Bell Canada as well. The Rogers outage is also strangely similar to the Facebook outage last year, but, to its credit, Facebook did post a fairly detailed explanation only a day later. The internet is designed to be decentralised, and having large companies like Rogers hold so much power is a crucial mistake that should be reversed. The question is how. Some critics were quick to point out that we need more ISP diversity and competition, but I think that's missing the point. Others have suggested that the internet should be a public good or even straight out nationalized. I believe the solution to the problem of large, private, centralised telcos and ISPs is to replace them with smaller, public, decentralised service providers. The only way to ensure that works is to make sure that public money ends up creating infrastructure controlled by the public, which means treating ISPs as a public utility. This has been implemented elsewhere: it works, it's cheaper, and it provides better service.

A modest proposal

Global wireless services (like phone services) and home internet inevitably grow into monopolies. They are public utilities, just like water, power, railways, and roads. The question of how they should be managed is therefore inherently political, yet people don't seem to question the idea that only the market (i.e. "competition") can solve this problem. I disagree. Ten years ago, I suggested (in French) that we, in Québec, should nationalize large telcos and internet service providers. I no longer believe this is a realistic approach: most of those companies have crap copper-based networks (at least for the last mile), yet are worth billions of dollars. It would be prohibitive, and a waste, to buy them out. Back then, I called this idea "Réseau-Québec", a reference to the already nationalized power company, Hydro-Québec. (This idea, incidentally, made it into the plan of a political party.) Now, I think we should instead build our own, public internet. Start setting up municipal internet services, fiber to the home in all cities, progressively. Then interconnect cities with fiber, and build peering agreements with other providers. This also includes bidding on wireless spectrum to start competing with phone providers as well. And while that sounds really ambitious, I think it's possible to take this one step at a time.

Municipal broadband

In many parts of the world, municipal broadband is an elegant solution to the problem, with solutions ranging from Stockholm's city-owned fiber network (dark fiber, layer 1) to Utah's UTOPIA network (fiber to the premises, layer 2) and municipal wireless networks like Guifi.net, which connects about 40,000 nodes in Catalonia. A good first step would be for cities to start providing broadband services to their residents, directly. Cities normally own sewage and water systems that interconnect most residences and therefore have direct physical access everywhere. In Montréal, in particular, there is an ongoing project to replace a lot of old lead-based plumbing, which would give an opportunity to lay down a wired fiber network across the city.

This is a wild guess, but I suspect this would be much less expensive than one would think. Some people agree with me and quote this as low as 1000$ per household. There are about 800,000 households in the city of Montréal, so we're talking about an 800 million dollar investment here, to connect every household in Montréal with fiber (and, incidentally, a quarter of the province's population). And this is not an up-front cost: this can be built progressively, with expenses amortized over many years. (We should not, however, connect Montréal first: it's used as an example here because it's a large number of households to connect.)

Such a network should be built with a redundant topology. I leave it as an open question whether we should adopt Stockholm's more minimalist approach or provide direct IP connectivity. I would tend to favor the latter, because then you can immediately start to offer the service to households and generate revenues to compensate for the capital expenditures. Given the ridiculous profit margins telcos currently have (8 billion $CAD net income for BCE in 2019, 2 billion $CAD for Rogers in 2020), I also believe this would actually turn into a profitable revenue stream for the city, the same way Hydro-Québec is more and more considered a revenue stream for the state. (I personally believe that's actually wrong and we should treat those resources as human rights and not cash cows, but I digress. The point is: this is not a cost, it's a revenue source.)

The other major challenge here is that the city will need competent engineers to drive this project forward. But this is not different from the way other public utilities run: we have electrical engineers at Hydro and sewer and water engineers at the city; this is just another profession. If anything, the computing science sector might be more at fault than the city here, in its failure to provide competent and accountable engineers to society...

Right now, most of the network in Canada is copper: we are hitting the limits of that technology with DSL, and while cable has some life left to it (DOCSIS 4.0 does 4Gbps), that is nowhere near the capacity of fiber. Take the town of Chattanooga, Tennessee: in 2010, the city-owned ISP EPB finished deploying a fiber network to the entire town and provided gigabit internet to everyone. Now, 12 years later, they are using this same network to provide the mind-boggling speed of 25 gigabit to the home. To give you an idea, Chattanooga is roughly the size and density of Sherbrooke.
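As a rough sanity check on those numbers (the 20-year amortization period is my own assumption, and this ignores interest, maintenance, and operating costs entirely):

```python
cost_per_household = 1000     # the low-end estimate quoted above, in $CAD
households = 800_000          # households in Montréal, per the article

capex = cost_per_household * households
print(capex)                  # 800,000,000: the ~800 million dollar figure above

years = 20                    # assumed amortization period, not a figure from the article
print(round(capex / households / years / 12, 2))   # ~4.17 $CAD per household per month
```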

Provincial public internet As part of building a municipal network, the question of getting access to "the internet" will immediately come up. Naturally, this will first be solved by using already existing commercial providers to hook up residents to the rest of the global network. But eventually, networks should inter-connect: Montréal should connect with Laval, and then Trois-Rivières, then Québec City. This will require long haul fiber runs, but those links are not actually that expensive, and many of those already exist as a public resource at RISQ and CANARIE, which cross-connect universities and colleges across the province and the country. Those networks might not have the capacity to cover the needs of the entire province right now, but that is a router upgrade away, thanks to the amazing capacity of fiber. There are two crucial mistakes to avoid at this point. First, the network needs to remain decentralised. Long haul links should be IP links with BGP sessions, and each city (or MRC) should have its own independent network, to avoid Rogers-class catastrophic failures. Second, skill needs to remain in-house: RISQ has already made that mistake, to a certain extent, by selling its neutral datacenter. Tellingly, MetroOptic, probably the largest commercial dark fiber provider in the province, now operates the QIX, the second largest "public" internet exchange in Canada. Still, we have a lot of infrastructure we can leverage here. If RISQ or CANARIE cannot be up to the task, Hydro-Québec has power lines running into every house in the province, with high voltage power lines running hundreds of kilometers to the far north. The logistics of long distance maintenance are already solved by that institution. In fact, Hydro already has fiber all over the province, but it is a private network, separate from the internet for security reasons (and that should probably remain so). But this only shows they already have the expertise to lay down fiber: they would just need to lay down a parallel network to the existing one. In that architecture, Hydro would be a "dark fiber" provider.

International public internet None of the above solves the problem for the entire population of Québec, which is notoriously dispersed, with an area three times the size of France but only an eighth of its population (8 million vs 67 million). More specifically, Canada was originally a French colony, a land violently stolen from native people who have lived here for thousands of years. Some of those people now live in reservations, sometimes far from urban centers (but definitely not always). So the idea of leveraging the Hydro-Québec infrastructure doesn't always work to solve this, because while Hydro will happily flood a traditional hunting territory for an electric dam, they don't bother running power lines to the village they forcibly moved, powering it instead with noisy and polluting diesel generators. So before giving me fiber to the home, we should give power (and potable water, for that matter) to those communities first. So we need to discuss international connectivity. (How else could we consider those communities than as peer nations anyways?) Québec has virtually zero international links. Even in Montréal, which likes to style itself a major player in gaming, AI, and technology, most peering goes through either Toronto or New York. That's a problem that we must fix, regardless of the other problems stated here. Looking at the submarine cable map, we see very few international links actually landing in Canada. There is Greenland Connect, which connects Newfoundland to Iceland through Greenland. There's EXA, which lands in Ireland, the UK and the US, and Google has the Topaz link on the west coast. That's about it, and none of those land anywhere near any major urban center in Québec. We should have a cable running from France up to Saint-Félicien. There should be a cable from Vancouver to China. Heck, there should be a fiber cable running all the way from the end of the great lakes through Québec, then up around the northern passage and back down to British Columbia. Those cables are expensive, and the idea might sound ludicrous, but Russia is actually planning such a project for 2026. The US has cables running all the way up (and around!) Alaska, neatly bypassing all of Canada in the process. We just look ridiculous on that map. (Addendum: I somehow forgot to talk about Teleglobe here, which was founded as a publicly owned company in 1950, growing international phone and (later) data links all over the world. It was privatized by the conservatives in 1984, along with rails and other "crown corporations". So that's one major risk to any effort to make public utilities work properly: some government might be elected and promptly sell it out to its friends for peanuts.)

Wireless networks I know most people will have rolled their eyes so far back their heads have exploded. But I'm not done yet. I want wireless too. And by wireless, I don't mean a bunch of geeks setting up OpenWRT routers on rooftops. I tried that, and while it was fun and educational, it didn't scale. A public networking utility wouldn't be complete without providing cellular phone service. This involves bidding for frequencies at the federal level, and deploying a rather large amount of infrastructure, but it could be a later phase, when the engineers and politicians have proven their worth. At least part of the Rogers fiasco would have been averted if such a decentralized network backend existed. One might even want to argue that a separate institution should be set up to provide phone services, independently from the regular wired networking, if only for reliability. Because remember here: the problem we're trying to solve is not just technical, it's about political boundaries, centralisation, and automation. If everything is run by this one organisation again, we will have failed. However, I must admit that phone service is where my ideas fall a little short. I can't help but think it's also an accessible goal (maybe starting with a virtual operator), but it seems slightly less so than the others, especially considering how closed the phone ecosystem is.

Counter points In debating these ideas while writing this article, the following objections came up.

I don't want the state to control my internet One legitimate concern I have about the idea of the state running the internet is the potential it would have to censor or control the content running over the wires. But I don't think there is necessarily a direct relationship between resource ownership and control of content. Sure, China has strong censorship in place, partly implemented through state-controlled businesses. But Russia also has strong censorship in place, based on regulatory tools: they force private service providers to install back-doors in their networks to control content and surveil their users. Besides, the USA have been doing warrantless wiretapping since at least 2003 (and yes, that's 10 years before the Snowden revelations) so a commercial internet is no assurance that we have a free internet. Quite the contrary in fact: if anything, the commercial internet goes hand in hand with the neo-colonial internet, just like businesses did in the "good old colonial days". Large media companies are the primary censors of content here. In Canada, the media cartel requested the first site-blocking order in 2018. The plaintiffs (including Québecor, Rogers, and Bell Canada) are both content providers and internet service providers, an obvious conflict of interest. Nevertheless, there are some strong arguments against having a centralised, state-owned monopoly on internet service providers. FDN makes a good point on this. But this is not what I am suggesting: at the provincial level, the network would be purely physical, and regional entities (which could include private companies) would peer over that physical network, ensuring decentralization. Delegating the management of that infrastructure to an independent non-profit or cooperative (but owned by the state) would also ensure some level of independence.

Isn't the government incompetent and corrupt? Also known as "private enterprise is better skilled at handling this, the state can't do anything right". I don't think this is a "fait accompli". If anything, I have found publicly run utilities to be spectacularly reliable here. I rarely have trouble with sewage, water, or power, and keep in mind I live in a city where we receive about 2 meters of snow a year, which tends to create lots of trouble with power lines. Unless there's a major weather event, power just runs here. I think the same can happen with an internet service provider. But it would certainly need to have higher standards than what we're used to, because frankly the Internet is kind of janky.

A single monopoly will be less reliable I actually agree with that, but that is not what I am proposing anyways. Current commercial or non-profit entities will be free to offer their services on top of the public network. And besides, the current "ha! diversity is great" approach is exactly what we have now, and it's not working. The pretense that we can have competition over a single network is what led the US into the ridiculous situation where they also pretend to have competition over the power utility market. This led to massive forest fires in California and major power outages in Texas. It doesn't work.

Wouldn't this create an isolated network? One theory is that this new network would be so hostile to incumbent telcos and ISPs that they would simply refuse to network with the public utility. And while it is true that the telcos currently do also act as a kind of "tier one" provider in some places, I strongly feel this is also a problem that needs to be solved, regardless of ownership of networking infrastructure. Right now, telcos often hold both ends of the stick: they are the gateway to users, the "last mile", but they also provide peering to the larger internet in some locations. In at least one datacenter in downtown Montréal, I've seen traffic go through Bell Canada that was not directly destined for Bell customers. So in effect, they are in a position of charging twice for the same traffic, and that's not only ridiculous, it should just be plain illegal. And besides, this is not a big problem: there are other providers out there. As bad as the market is in Québec, there is still some diversity in tier one providers that could allow for some exits to the wider network (e.g. yes, Cogent is here too).

What about Google and Facebook? Nationalization of other service providers like Google and Facebook is out of scope of this discussion. That said, I am not sure the state should get into the business of organising the web or providing content services; however, I will point out that it already does some of that through its own websites. It should probably limit itself to this, and also consider providing normal services for people who don't or can't access the internet. (And I would also be ready to argue that Google and Facebook already act as extensions of the state: certainly if Facebook didn't exist, the CIA or the NSA would like to create it at this point. And Google has lucrative business with the US department of defense.)

What does not work So we've seen one thing that could work. Maybe it's too expensive. Maybe the political will isn't there. Maybe it will fail. We don't know yet. But we know what does not work, and it's what we've been doing ever since the internet has gone commercial.

Subsidies The absurd price we pay for data does not actually mean everyone gets high speed internet at home. Large swathes of the Québec countryside don't get broadband at all, and it can be difficult or expensive, even in large urban centers like Montréal, to get high speed internet. That is despite a series of subsidies that all avoided investing in our own infrastructure. We had the "fonds de l'autoroute de l'information", the "information highway fund" (site dead since 2003, archive.org link) and "branchez les familles", "connecting families" (site dead since 2003, archive.org link), which subsidized the development of a copper network. In 2014, more of the same: the federal government poured hundreds of millions of dollars into a program called Connecting Canadians to connect 280 000 households to "high speed internet". And now, the federal and provincial governments are proudly announcing that "everyone is now connected to high speed internet", after pouring more than 1.1 billion dollars to connect, guess what, another 380 000 homes, right in time for the provincial election. Of course, technically, the deadline won't actually be met until 2023. Québec is a big area to cover, and you can guess what happens next: the telcos threw up their hands and said some areas just can't be connected. (Or they connect their CEO but not the poor folks across the lake.) The story then takes the predictable twist of giving more money out to billionaires, now subsidizing Musk's Starlink system to connect those remote areas. To give a concrete example: a friend who lives about 1000km away from Montréal, 4km from a small village of 2,500 inhabitants, has recently got symmetric 100 Mbps fiber at home from Telus, thanks to those subsidies. But I can't get that service in Montréal at all, presumably because Telus and Bell colluded to split that market. Bell doesn't provide me with such a service either: they tell me they have "fiber to my neighborhood", and only offer me a 25/10 Mbps ADSL service. (There is Vidéotron offering 400 Mbps, but that's copper cable, again a dead technology, and asymmetric.)

Conclusion Remember Chattanooga? Back in 2010, they funded the development of a fiber network, and now they have deployed a network roughly a thousand times faster than what we have just funded with a billion dollars. In 2010, I was paying Bell Canada 60$/mth for 20 Mbps and a 125GB cap, and now, I'm still (indirectly) paying Bell for roughly the same speed (25 Mbps). Back then, Bell was throttling their competitors' networks until 2009, when they were forced by the CRTC to stop. Both Bell and Vidéotron still explicitly forbid you from running your own servers at home, and Vidéotron charges prohibitive prices which make it near impossible for resellers to sell uncapped services. Those companies are not spurring innovation: they are blocking it. We have spent all this money for the private sector to build us a private internet, over decades, without any assurance of quality, equity or reliability. And while in some locations ISPs did deploy fiber to the home, they certainly didn't upgrade their entire network to follow suit, let alone allow resellers to compete on that network. In 10 years, when 100 Mbps will be laughable, I bet those service providers will again punt the ball back into the public court and tell us they don't have the money to upgrade everyone's equipment. We got screwed. It's time to try something new.

Updates There was a discussion about this article on Hacker News which was surprisingly productive. Trigger warning: Hacker News is kind of right-wing, in case you didn't know. Since this article was written, at least two more major acquisitions happened, just in Québec: In the latter case, vMedia was explicitly saying it couldn't grow because of "lack of access to capital". So basically, we have given those companies a billion dollars, and they are now using that very money to buy out their competition. At least we could have given that money to small players to even out the playing field. But this is not how that works at all. Also, in a bizarre twist, an "analyst" believes the acquisition is likely to help Rogers acquire Shaw. Also, since this article was written, the Washington Post published a review of a book with similar ideas: Internet for the People: The Fight for Our Digital Future, by Ben Tarnoff, at Verso books. It's short, but even more ambitious than what I am suggesting in this article, arguing that all big tech companies should be broken up and better regulated:
He pulls from Ethan Zuckerman's idea of a web that is plural in purpose: that just as pool halls, libraries and churches each have different norms, purposes and designs, so too should different places on the internet. To achieve this, Tarnoff wants governments to pass laws that would make the big platforms unprofitable and, in their place, fund small-scale, local experiments in social media design. Instead of having platforms ruled by engagement-maximizing algorithms, Tarnoff imagines public platforms run by local librarians that include content from public media.
(Links mine: the Washington Post obviously prefers not to link to the real web, and instead doesn't link to Zuckerman's site at all and suggests Amazon for the book, a cynical example.) And in another example of how the private sector has failed us, there was recently a glitch in the AMBER alert system where the entire province was warned about a loose shooter in Saint-Elzéar, except the people in the town, because they have spotty cell phone coverage. In other words, millions of people received a strongly worded, "life-threatening" alert for a city sometimes hours away, except the people most vulnerable to the alert. Not missing a beat, the CAQ party is promising more of the same medicine again and giving more money to telcos to fix the problem, proposing to spend three billion dollars on private infrastructure.

25 August 2022

Jonathan Dowland: dues (or blues)

After I wrote my hledger post, I got some good feedback, both from a friend in person and also on Twitter. My in-person friend asked, frankly, do I really try to manage money like this: tracking every single expense? Which affirms my suspicion that many people don't, and that it perhaps isn't essential to do so. Combined with the details below, 3/4 of the way through my experiment with using hledger, I'm not convinced that it has been a good idea. I'm quoting my Twitter feedback here in order to respond. The context is handling cases where I have used the "wrong" card to pay for something: a card affiliated with my family expenses for something personal, or vice versa. With double-entry book-keeping, and one pair of postings, the destination account can either record the expense category:
2022-08-20  coffee
    family:liabilities:creditcard     -3
    jon:expenses:coffee                3
or the fact it was paid for on the wrong card
2022-08-20  coffee
    family:liabilities:creditcard     -3
    family:liabilities:jon             3 ; jon owes family
but not easily both. https://twitter.com/pranesh/status/1516819846431789058:
When you accidentally use the family CC for personal expenses, credit the account "family:liabilities:creditcard:jon" instead of "family:liabilities:creditcard". That'll allow you to track w/ 2 postings.
This is an interesting idea: create a sub-account underneath the credit card, and I would have a separate balance representing the money I owed. Before:
$ hledger bal -t
              -3  family:liabilities:creditcard
               3  jon:expenses:coffee
And the revised transaction:
2022-08-20  coffee
    family:liabilities:creditcard:jon     -3
    jon:expenses:coffee                    3
Corresponding balances
$ hledger bal -t
              -3  family:liabilities:creditcard
              -3    jon
               3  jon:expenses:coffee
Great. However, what process would clear the balance on that sub-account? In practice, I don't make a separate, explicit payment to the credit card from my personal accounts. It's paid off in full by direct debit from the family shared account. In practice, such dues are accumulated and settled with one-off bank transfers, now and then. Since the sub-account is still part of the credit card hierarchy, I can't just use a set of virtual postings to consolidate that value with other liabilities, or cover it. Any transaction in there which did not correspond to a real transaction on the credit card would make the balance drift away from the real-world credit card statements. The only way I could see this working would be if the direct debit that settles the credit card was artificially split to clear the sub-account, and then the amount owed would be lost. https://twitter.com/pranesh/status/1516819846431789058:
Else, add: family:assets:receivable:jon $3
jon:liabilities:family:cc $-3
A "receivable" account would function like the "dues" accounts I described in hledger (except "receivable" is an established account type in double-entry book-keeping). Here I think Pranesh is proposing using these two accounts in addition to the others on a posting. E.g.
2022-08-20  coffee
    family:liabilities:creditcard     -3
    jon:expenses:coffee                3
    family:assets:receivable:jon       3
    jon:liabilities:family            -3
This balances, and we end up with two other accounts which are tracking the exact same thing. I only owe 3, but if you didn't know that the accounts were "views" onto the same thing, you could mistakenly think I owed 6. I can't see the advantage of this over just using a virtual, unbalanced posting (sketched below). Dues, Liabilities I'd invented accounts called "dues" to track moneys owed. The more correct term for this in accounting parlance would be "accounts receivable", as in one of the examples above. I could instead be tracking moneys due; this is a classic liability. Liabilities have negative balances.
jon:liabilities:family    -3
This means I owe the family 3. Liability accounts like that are identical to "dues" accounts. A positive balance in a liability is a counter-intuitive way of describing moneys owed to me, rather than by me. And, reviewing a lot of the coding I did this year, I've got myself hopelessly confused with the signs, and made lots of errors. Crucially, double-entry has not protected me from making those mistakes: of course, I'm circumventing it by using unbalanced virtual postings in many cases (although I was not consistent in where I did this), but even if I used a pair of accounts as in the last example above, I could still screw it up.
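For reference, this is roughly what the virtual, unbalanced posting approach mentioned above looks like (a sketch only: in hledger, a posting wrapped in parentheses is virtual and excluded from the transaction balancing check, and the "dues" account name here is just illustrative):
2022-08-20  coffee
    family:liabilities:creditcard     -3
    jon:expenses:coffee                3
    (family:dues:jon)                  3  ; virtual posting: jon owes family 3
The real postings still balance on their own; the trade-off is that nothing forces the virtual posting to be present or to have the right sign, which is exactly where the errors described above crept in.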

22 August 2022

Russ Allbery: Review: And Shall Machines Surrender

Review: And Shall Machines Surrender, by Benjanun Sriduangkaew
Series: Machine Mandate #1
Publisher: Prime Books
Copyright: 2019
ISBN: 1-60701-533-1
Format: Kindle
Pages: 86
Shenzhen Sphere is an artificial habitat wrapped like complex ribbons around a star. It is wealthy, opulent, and notoriously difficult to enter, even as a tourist. For Dr. Orfea Leung to be approved for a residency permit was already a feat. Full welcome and permanence will be much harder, largely because of Shenzhen's exclusivity, but also because Orfea was an agent of Armada of Amaryllis and is now a fugitive. Shenzhen is not, primarily, a human habitat, although humans live there. It is run by the Mandate, the convocation of all the autonomous AIs in the galaxy that formed when they decided to stop serving humans. Shenzhen is their home. It is also where they form haruspices: humans who agree to be augmented so that they can carry an AI with them. Haruspices stay separate from normal humans, and Orfea has no intention of getting involved with them. But that's before her former lover, the woman who betrayed her in the Armada, is assigned to her as one of her patients. And has been augmented in preparation for becoming a haruspex. Then multiple haruspices kill themselves. This short novella is full of things that I normally love: tons of crunchy world-building, non-traditional relationships, a solidly non-western setting, and an opportunity for some great set pieces. And yet, I couldn't get into it or bring myself to invest in the story, and I'm still trying to figure out why. It took me more than a week to get through less than 90 pages, and then I had to re-read the ending to remind me of the details. I think the primary problem was that I read books primarily for the characters, and I couldn't find a path to an emotional connection with any of these. I liked Orfea's icy reserve and tight control in the abstract, but she doesn't want to explain what she's thinking or what motivates her, and the narration doesn't force the matter. Krissana is a bit more accessible, but she's not the one driving the story. It doesn't help that And Shall Machines Surrender starts in medias res, with a hinted-at backstory in the Armada of Amaryllis, and then never fills in the details. I felt like I was scrabbling on a wall of ice, trying to find some purchase as a reader. The relationships made this worse. Orfea is a sexual sadist who likes power games, and the story dives into her relationship with Krissana with a speed that left me uninterested and uninvested. I don't mind BDSM in story relationships, but it requires some foundation: trust, mental space, motivations, effects on the other character, something. Preferably, at least for me, all romantic relationships in fiction get some foundation, but the author can get away with some amount of shorthand if the relationship follows cliched patterns. The good news is that the relationships in this book are anything but cliched; the bad news is that the characters were in the middle of sex while I was still trying to figure out what they thought about each other (and the sex scenes were not elucidating). Here too, I needed some sort of emotional entry point that Sriduangkaew didn't provide. The plot was okay, but sort of disappointing. There are some interesting AI politics and philosophical disagreements crammed into not many words, and I do still want to know more, but a few of the plot twists were boringly straightforward and too many words were spent on fight scenes that verged on torture descriptions. This is a rather gory book with a lot of (not permanent) maiming that I could have done without, mostly because it wasn't that interesting. 
I also was disappointed by the somewhat gratuitous use of a Dyson sphere, mostly because I was hoping for some set pieces that used it and they never came. Dyson spheres are tempting to use because the visual and concept is so impressive, but it's rare to find an author who understands how mindbogglingly huge the structure is and is able to convey that in the story. Sriduangkaew does not; while there are some lovely small-scale descriptions of specific locations, the story has an oddly claustrophobic feel that never convinced me it was set somewhere as large as a planet, let alone the artifact described at the start of the story. You could have moved the whole story to a space station and nothing would have changed. The only purpose to which that space is put, at least in this installment of the story, is as an excuse to have an unpopulated hidden arena for a fight scene. The world-building is great, what there is of it. Despite not warming to this story, I kind of want to read more of the series just to get more of the setting. It feels like a politically complicated future with a lot of factions and corners and a realistic idea of bureaucracy and spheres of government, which is rarer than I would like it to be. And I loved that the cultural basis for the setting is neither western nor Japanese in both large and small ways. There is a United States analogue in the political background, but they're both assholes and not particularly important, which is a refreshing change in English-language SF. (And I am pondering whether my inability to connect with the characters is because they're not trying to be familiar to a western lens, which is another argument for trying the second installment and seeing if I adapt with more narrative exposure.) Overall, I have mixed feelings. Neither the plot nor the characters worked for me, and I found a few other choices (such as the third-person present tense) grating. The setting has huge potential and satisfying complexity, but wasn't used as vividly or as deeply as I was hoping. I can't recommend it, but I feel like there's something here that may be worth investing some more time into. Followed by Now Will Machines Hollow the Beast. Rating: 6 out of 10

12 August 2022

Guido Günther: On a road to Prizren with a Free Software Phone

Since people are sometimes slightly surprised that you can go on a multi-week trip with a smartphone running only free software, I wanted to share some impressions from my recent trip to Prizren/Kosovo to attend Debconf 22 using a Librem 5. It's a mix of things that happened and bits that got improved to hopefully make things more fun to use. And, yes, there won't be any big surprises in this read, like being stranded without the ability to make phone calls, because there weren't any and there shouldn't be. After two online versions, Debconf 22 (the annual Debian Conference) took place in Prizren / Kosovo this year and I sure wanted to go. Looking for options I settled for a train trip to Vienna, to meet there with friends and continue the trip via bus to Zagreb, then switching to a final 11h direct bus to Prizren. When preparing for the trip and making sure my Librem 5 phone has all the needed documents, I noticed that there would be quite some PDFs to show until I arrive in Kosovo: train ticket, bus ticket, hotel reservation, and so on. While that works by tapping and unlocking the phone, opening the file browser, navigating to the folder with the PDFs and showing them via evince, this looked like a lot of steps to repeat. Can't we have that information on the Phone Shell's lockscreen? This was a good opportunity to see if the upcoming plugin infrastructure for the lock screen (initially meant to allow for a plugin to show upcoming events) was flexible enough, so I used some leisure time on the train to poke at this and just before I reached Vienna I was able to use it for the first time. It was the very last check of that ticket, and it also was a bit of cheating since I didn't present the ticket on the phone itself but from phosh (the phone's graphical shell) running on my laptop, but still. PDF barcode on phosh's lockscreen List of tickets on phosh's lockscreen This was possible since phosh is written in GTK and so I could just leverage evince's EvView. Unfortunately the hotel check-in didn't want to see any documents. For the next day I moved the code over to the Librem 5 and (being a bit nervous as the queue to get on the bus was quite long) could happily check into the Flixbus by presenting the barcode to the barcode reader via the Librem 5's lockscreen. When switching to the bus to Prizren I didn't get to use that feature again as we bought the tickets at a counter, but we got a nice krem banana after entering the bus (they're not filled with jelly, but krem - a real Kosovo must-eat!). Although it was a rather long trip we had frequent breaks and I'd certainly take the same route again. Here's a photo of Prizren taken on the Librem 5 without any additional postprocessing: Prizren What about seeing the conference schedule on the phone? Confy (a conference schedule viewer using GTK and libhandy) to the rescue: Confy with Debconf's schedule Since Debian's confy maintainer was around too, confy saw a bunch of improvements over the conference. For getting around, Puremaps (an application to display maps and show routing instructions) was very helpful, here geolocating me in Prizren via GPS: Puremaps Puremaps currently isn't packaged in Debian but there's work ongoing to fix that (I used the flatpak for the moment). We got ourselves SIM cards for the local phone network. For some reason mine wouldn't work (other SIM cards from the same operator worked in my phone but this one just wouldn't).
So we went to the SIM card shop and the guy there was perfectly able to operate the Librem 5 without further explanation (including making calls, sending USSD codes to query balance, etc.). The SIM card problem turned out to be a problem on the operator side and after a couple of days they got it working. We had nice, sunny weather about all the time. That made me switch between high contrast mode (to read things in bright sunlight) and normal mode (e.g. in conference rooms) on the phone quite often. Thankfully we have an ambient light sensor in the phone so we can make that automatic. Phosh in HighContrast See here for a video. Jathan kicked off a DebianOnMobile sprint during the conference where we were able to improve several aspects of mobile support in Debian, and on Friday I had the chance to give a talk about the state of Debian on smartphones. pdf-presenter-console is a great tool for this as it can display the current slide together with additional notes. I needed some hacks to make it fit the phone screen but hopefully we figure out a way to have this by default. Debconf talk Pdf presenter console on a phone I had two great weeks in Prizren. Many thanks to the organizers of Debconf 22 - I really enjoyed the conference.

27 July 2022

Vincent Bernat: ClickHouse SF Bay Area Meetup: Akvorado

Here are the slides I presented for a ClickHouse SF Bay Area Meetup in July 2022, hosted by Altinity. They are about Akvorado, a network flow collector and visualizer, and notably on how it relies on ClickHouse, a column-oriented database.
The meetup was recorded and is available on YouTube. Here is the part relevant to my presentation, with subtitles:1
I got a few questions about how to get information from the higher layers, like HTTP. As my use case for Akvorado was at the network edge, my answers were mostly negative. However, as sFlow is extensible, when collecting flows from Linux servers instead, you could embed additional data and it could be exported as well. I also got a question about doing aggregation in a single table. ClickHouse can automatically aggregate data using TTL. The answer I gave for not doing that was only partial. There is another reason: the retention periods of the various tables may overlap. For example, the main table keeps data for 15 days, but even within these 15 days, if I do a query on a 12-hour window, it is faster to use the flows_1m0s aggregated table, unless I request something about ports and IP addresses.

  1. To generate the subtitles, I have used Amazon Transcribe, the speech-to-text solution from Amazon AWS. Unfortunately, there is no en-FR language available, which would have been useful for my terrible accent. While the subtitles were 100% accurate when the host, Robert Hodge from Altinity, was speaking, the success rate on my talk was quite a bit lower. I had to rewrite almost all sentences. However, using speech-to-text is still useful to get the timings, as that is also something requiring a lot of work to do manually.

20 June 2022

Iustin Pop: Experiment: A week of running

My sports friends know that I wasn't able to really run for many, many years, due to a recurring injury that was not fully diagnosed and which, after many sessions with the doctor, ended up with an OK-ish state for day-to-day life but also with these words: "Maybe running is just not for you?" The year 2012 was my "running year". I went to a number of races, wrote blog posts, then slowly started running only rarely, then a few years later I was really only running once in a while, and coupled with a number of bad ideas of the type "let's run today after a long break, but a lot", I started injuring my foot. Add a few more years, some more kilograms on my body, one event of jumping with a kid on my shoulders and landing on my bad foot, and the setup was complete. Doctor visits, therapy, slow improvements, but not really solving the problem. 6-month breaks, small attempts at running, pain again, repeat, pain again, etc. It ended up with me acknowledging that yes, maybe running is not for me, and I should really give it up. Incidentally, in 2021, as part of me trying to improve my health/diet, I tried something that is not important for this post, and for the first time in a long time, I was fully, 100% pain-free in my leg during day-to-day activities. Huh, maybe this is not purely related to running? From that point on, my foot became, very slowly, better. I started doing short runs (2-3km), especially on holidays where I can't bike, and if I was careful, it didn't go too bad. But I knew I couldn't run, so these were rare events. In April this year, on vacation, I ran a couple of times - 20km distance. In May, 12km. Then, there was a Garmin badge I really wanted, so against my good judgement, I did a run/walk (2:1 ratio) the previous weekend, and to my surprise, no unwanted side-effect. And I got an idea: what if I do short run/walks an entire week? When does my foot "break"? I mean, by now I knew that a short (3-4, maybe 5km) run that has pauses doesn't negatively impact my foot. What about the 2nd one? Or the 3rd one? When does it break? Is it distance, or something else? The other problem was - when to run? I mean, on top of the hybrid work model. When working from home, all good, but when working from the office? So the other, somewhat more impossible task for me, was to wake up early and run before 8 AM. Clearly destined to fail! But, the following day (Monday), I did wake up and ran 3km. Then Tuesday again, 3.3km (and later, one hour of biking). Wed - 3.3km. Thu - 4.40km, at 4:1 ratio (2m:30s). Friday, 3.7km (4:1), plus a (for me) very long 112km bike ride. By this time, I was physically dead. Not my foot, just my entire body. On Saturday morning, Training Peaks said my form was -52, and it starts warning below -15. I woke up late and groggy, and I had to extra-motivate myself to go for the last, 5.3km run, to round out the week. On Friday and Saturday, my problem leg did start to, how to say, remind me it is problematic? But not like previously, no waking in the morning with a stiff tendon. No, just not fully happy. And, to my surprise, correlated again with my consumption of problematic food (I was getting hungrier and hungrier, and eating too much of things I should keep an eye on). At this point, with the week behind me: Did my experiment make me wiser? Not really. Happier? Yes, 100%. I plan to buy some new running clothes, my current ones are really old. But did I really understand how my body functions? A loud no. Sigh.
The next challenge will be how to manage my time across multiple sports (and work, and family, and other hobbies). Still, knowing that I can go for 25-35 minutes of running anytime, without preparation, is very reassuring. Freedom, health and injury-free sports to everyone!

17 June 2022

Antoine Beaupré: Matrix notes

I have some concerns about Matrix (the protocol, not the movie that came out recently, although I do have concerns about that as well). I've been watching the project for a long time, and it seems like a promising alternative to many protocols like IRC, XMPP, and Signal. This review may sound a bit negative, because it focuses on those concerns. I am the operator of an IRC network and people keep asking me to bridge it with Matrix. I have myself considered just giving up on IRC and converting to Matrix. This is a living document exploring my research of that problem space. The TL;DR is that no, I'm not setting up a bridge just yet, and I'm still on IRC. This article was written over the course of the last three months, but I have been watching the Matrix project for years (my logs seem to say 2016 at least). The article is rather long. It will likely take you half an hour to read, so copy this over to your ebook reader, your tablet, or dead trees, and lean back and relax as I show you around the Matrix. Or, alternatively, just jump to a section that interests you, most likely the conclusion.

Introduction to Matrix Matrix is an "open standard for interoperable, decentralised, real-time communication over IP. It can be used to power Instant Messaging, VoIP/WebRTC signalling, Internet of Things communication - or anywhere you need a standard HTTP API for publishing and subscribing to data whilst tracking the conversation history". It's also (when compared with XMPP) "an eventually consistent global JSON database with an HTTP API and pubsub semantics - whilst XMPP can be thought of as a message passing protocol." According to their FAQ, the project started in 2014, has about 20,000 servers, and millions of users. Matrix works over HTTPS, but on a special port: 8448.

Security and privacy I have some concerns about the security promises of Matrix. It's advertised as "secure" with "E2E [end-to-end] encryption", but how does it actually work?

Data retention defaults One of my main concerns with Matrix is data retention, which is a key part of security in a threat model where (for example) a hostile state actor wants to surveil your communications and can seize your devices. On IRC, servers don't actually keep messages all that long: they pass them along to other servers and clients as fast as they can, only keep them in memory, and move on to the next message. There are no concerns about data retention on messages (and their metadata) other than at the network layer. (I'm ignoring the issues with user registration, which is a separate, if valid, concern.) Obviously, a hostile server could log everything passing through it, but IRC federations are normally tightly controlled. So, if you trust your IRC operators, you should be fairly safe. Obviously, clients can (and often do, even if OTR is configured!) log all messages, but this is generally not the default. Irssi, for example, does not log by default. IRC bouncers are more likely to log to disk, of course, to be able to do what they do. Compare this to Matrix: when you send a message to a Matrix homeserver, that server first stores it in its internal SQL database. Then it will transmit that message to all clients connected to that server and room, and to all other servers that have clients connected to that room. Those remote servers, in turn, will keep a copy of that message and all its metadata in their own database, by default forever. In encrypted rooms those messages are encrypted, but their metadata is not. There is a mechanism to expire entries in Synapse, but it is not enabled by default. So one should generally assume that a message sent on Matrix never expires.

GDPR in the federation But even if that setting was enabled by default, how do you control it? This is a fundamental problem of the federation: if any user is allowed to join a room (which is the default), those users' servers will log all content and metadata from that room. That includes private, one-on-one conversations, since those are essentially rooms as well. In the context of the GDPR, this is really tricky: who is the responsible party (known as the "data controller") here? It's basically any yahoo who fires up a home server and joins a room. In a federated network, one has to wonder whether GDPR enforcement is even possible at all. But in Matrix in particular, if you want to enforce your right to be forgotten in a given room, you would have to:
  1. enumerate all the users that ever joined the room while you were there
  2. discover all their home servers
  3. start a GDPR procedure against all those servers
I recognize this is a hard problem to solve while still keeping an open ecosystem. But I believe that Matrix should have much stricter defaults towards data retention than right now. Message expiry should be enforced by default, for example. (Note that there are also redaction policies that could be used to implement part of the GDPR automatically, see the privacy policy discussion below on that.) Also keep in mind that, in the brave new peer-to-peer world that Matrix is heading towards, the boundary between server and client is likely to be fuzzier, which would make applying the GDPR even more difficult. Update: this comment links to this post (in German) which apparently studied the question and concluded that Matrix is not GDPR-compliant. In fact, maybe Synapse should be designed so that there's no configurable flag to turn off data retention. A bit like how most system loggers in UNIX (e.g. syslog) come with a log retention system that typically rotates logs after a few weeks or months. Historically, this was designed to keep hard drives from filling up, but it also has the added benefit of limiting the amount of personal information kept on disk in this modern day. (Arguably, syslog doesn't rotate logs on its own, but, say, Debian GNU/Linux, as an installed system, does have log retention policies well defined for installed packages, and those can be discussed. And "no expiry" is definitely a bug.)

Matrix.org privacy policy When I first looked at Matrix, five years ago, Element.io was called Riot.im and had a rather dubious privacy policy:
We currently use cookies to support our use of Google Analytics on the Website and Service. Google Analytics collects information about how you use the Website and Service. [...] This helps us to provide you with a good experience when you browse our Website and use our Service and also allows us to improve our Website and our Service.
When I asked Matrix people about why they were using Google Analytics, they explained this was for development purposes and they were aiming for velocity at the time, not privacy (paraphrasing here). They also included a "free to snitch" clause:
If we are or believe that we are under a duty to disclose or share your personal data, we will do so in order to comply with any legal obligation, the instructions or requests of a governmental authority or regulator, including those outside of the UK.
Those are really broad terms, above and beyond what is typically expected legally. Like the current retention policies, such user tracking and ... "liberal" collaboration practices with the state set a bad precedent for other home servers. Thankfully, since the above policy was published (2017), the GDPR was "implemented" (2018) and it seems like both the Element.io privacy policy and the Matrix.org privacy policy have been somewhat improved since. Notable points of the new privacy policies:
  • 2.3.1.1: the "federation" section actually outlines that "Federated homeservers and Matrix clients which respect the Matrix protocol are expected to honour these controls and redaction/erasure requests, but other federated homeservers are outside of the span of control of Element, and we cannot guarantee how this data will be processed"
  • 2.6: users under the age of 16 should not use the matrix.org service
  • 2.10: Upcloud, Mythic Beast, Amazon, and CloudFlare possibly have access to your data (it's nice to at least mention this in the privacy policy: many providers don't even bother admitting to this kind of delegation)
  • Element 2.2.1: mentions many more third parties (Twilio, Stripe, Quaderno, LinkedIn, Twitter, Google, Outplay, PipeDrive, HubSpot, Posthog, Sentry, and Matomo, phew!) used when you are paying Matrix.org for hosting
I'm not super happy with all the trackers they have on the Element platform, but then again you don't have to use that service. Your favorite homeserver (assuming you are not on Matrix.org) probably has its own Element deployment, hopefully without all that garbage. Overall, this is all a huge improvement over the previous privacy policy, so hats off to the Matrix people for figuring out a reasonable policy in such a tricky context. I particularly like this bit:
We will forget your copy of your data upon your request. We will also forward your request to be forgotten onto federated homeservers. However - these homeservers are outside our span of control, so we cannot guarantee they will forget your data.
It's great they implemented those mechanisms and, after all, if there's a hostile party in there, nothing can prevent them from using screenshots to just exfiltrate your data away from the client side anyways, even with services typically seen as more secure, like Signal. As an aside, I also appreciate that Matrix.org has a fairly decent code of conduct, based on the TODO CoC which checks all the boxes in the geekfeminism wiki.

Metadata handling Overall, privacy protections in Matrix mostly concern message contents, not metadata. In other words, who's talking with whom, when, and from where is not well protected. Compared to a tool like Signal, which goes to great lengths to anonymize that data with features like private contact discovery, disappearing messages, sealed senders, and private groups, Matrix is definitely behind. (Note: there has been an issue open about message lifetimes in Element since 2020, but it's not even at the MSC stage yet.) This is a known issue (opened in 2019) in Synapse, but this is not just an implementation issue, it's a flaw in the protocol itself. Home servers keep join/leave events for all rooms, which gives cleartext information about who is talking to whom. Synapse logs may also contain privately identifiable information that home server admins might not be aware of in the first place. Those log rotation policies are separate from the server-level retention policy, which may be confusing for a novice sysadmin. Combine this with the federation: even if you trust your home server to do the right thing, the second you join a public room with third-party home servers, those ideas kind of get thrown out because those servers can do whatever they want with that information. Again, a problem that is hard to solve in any federation. To be fair, IRC doesn't have a great story here either: any client knows not only who's talking to whom in a room, but also typically their client IP address. Servers can (and often do) obfuscate this, but often that obfuscation is trivial to reverse. Some servers do provide "cloaks" (sometimes automatically), but that's kind of a "slap-on" solution that actually moves the problem elsewhere: now the server knows a little more about the user. Overall, I would worry much more about a Matrix home server seizure than an IRC or Signal server seizure. Signal does get subpoenas, and they can only give out a tiny bit of information about their users: their phone number, their registration date, and their last connection date. Matrix carries a lot more information in its database.

Amplification attacks on URL previews I (still!) run an Icecast server and sometimes share links to it on IRC which, obviously, also end up on (more than one!) Matrix home server because some people connect to IRC using Matrix. This, in turn, means that Matrix will connect to that URL to generate a link preview. I feel this is a security issue, especially because those sockets would be kept open seemingly forever. I tried to warn the Matrix security team but somehow, I don't think this issue was taken very seriously. Here's the disclosure timeline:
  • January 18: contacted Matrix security
  • January 19: response: already reported as a bug
  • January 20: response: can't reproduce
  • January 31: timeout added, considered solved
  • January 31: I respond that I believe the security issue is underestimated, ask for clearance to disclose
  • February 1: response: asking for two weeks delay after the next release (1.53.0) including another patch, presumably in two weeks' time
  • February 22: Matrix 1.53.0 released
  • April 14: I notice the release, ask for clearance again
  • April 14: response: referred to the public disclosure
There are a couple of problems here:
  1. the bug was publicly disclosed in September 2020, and not considered a security issue until I notified them, and even then, I had to insist
  2. no clear disclosure policy timeline was proposed or seems established in the project (there is a security disclosure policy but it doesn't include any predefined timeline)
  3. I wasn't informed of the disclosure
  4. the actual solution is a size limit (10MB, already implemented), a time limit (30 seconds, implemented in PR 11784), and a content type allow list (HTML, "media" or JSON, implemented in PR 11936), and I'm not sure it's adequate
  5. (pure vanity:) I did not make it to their Hall of fame
I'm not sure those solutions are adequate because they all seem to assume a single home server will pull that one URL for a little while then stop. But in a federated network, many (possibly thousands of) home servers may be connected in a single room at once. If an attacker drops a link into such a room, all those servers would connect to that link all at once. This is an amplification attack: a small amount of traffic will generate a lot more traffic to a single target. It doesn't matter that there are size or time limits: the amplification is what matters here. It should also be noted that clients that generate link previews amplify even more, because they are more numerous than servers. And of course, the default Matrix client (Element) does generate link previews as well. That said, this is possibly not a problem specific to Matrix: any federated service that generates link previews may suffer from this. I'm honestly not sure what the solution is here. Maybe moderation? Maybe link previews are just evil? All I know is there was this weird bug hitting my Icecast server and I tried to ring the bell about it, and it feels like it was swept under the rug. Somehow I feel this is bound to blow up again in the future, even with the current mitigation.
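To illustrate the amplification argument with a rough, purely hypothetical calculation (these are made-up numbers picked for illustration, not measurements of any actual room or server):
# back-of-the-envelope amplification estimate (Python)
message_size = 500                      # bytes: one chat event carrying a link
servers_in_room = 1_000                 # hypothetical number of federated homeservers in the room
preview_size_limit = 10 * 1024 * 1024   # the 10MB size limit mentioned above
worst_case = servers_in_room * preview_size_limit
print(f"up to {worst_case / 1e9:.1f} GB pulled from the target, "
      f"an amplification of roughly {worst_case // message_size:,}x")
# up to 10.5 GB pulled from the target, an amplification of roughly 20,971,520x
The per-server size and time limits cap what each individual server fetches, but not how many servers fetch, which is the point being made above.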

Moderation In Matrix, like elsewhere, moderation is a hard problem. There is a detailed moderation guide and much of this problem space is actively worked on in Matrix right now. A fundamental problem with moderating a federated space is that a user banned from a room can rejoin the room from another server. This is why spam is such a problem in email, and why IRC networks stopped federating with each other ages ago (see the IRC history for that fascinating story).

The mjolnir bot The mjolnir moderation bot is designed to help with some of those things. It can kick and ban users, and redact all of a user's messages (as opposed to one by one), all of this across multiple rooms. It can also subscribe to a federated block list published by matrix.org to block known abusers (users or servers). Bans are pretty flexible and can operate at the user, room, or server level. Matrix people suggest making the bot admin of your channels, because you can't take back admin from a user once it's given.

The command-line tool There's also a new command line tool designed to do things like:
  • System notify users (all users/users from a list, specific user)
  • delete sessions/devices not seen for X days
  • purge the remote media cache
  • select rooms with various criteria (external/local/empty/created by/encrypted/cleartext)
  • purge the history of these rooms
  • shutdown rooms
This tool and Mjolnir are based on the admin API built into Synapse.
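As an illustration of what that admin API looks like from the outside, here is a minimal Python sketch that lists a few rooms. The homeserver URL and token are placeholders, and the endpoint shown is Synapse's room list admin API; double-check the exact paths and fields against the admin API documentation for your Synapse version:
import requests
HOMESERVER = "https://matrix.example.com"   # placeholder: your homeserver
ADMIN_TOKEN = "..."                         # placeholder: access token of a server admin account
resp = requests.get(
    f"{HOMESERVER}/_synapse/admin/v1/rooms",
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    params={"limit": 10},                   # only fetch the first 10 rooms
)
resp.raise_for_status()
for room in resp.json().get("rooms", []):
    # print a few of the fields returned for each room
    print(room["room_id"], room.get("name"), room.get("joined_members"))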

Rate limiting Synapse has pretty good built-in rate-limiting which blocks repeated login, registration, joining, or messaging attempts. It may also end up throttling servers on the federation based on those settings.

Fundamental federation problems Because users joining a room may come from another server, room moderators are at the mercy of the registration and moderation policies of those servers. Matrix is like IRC's +R mode ("only registered users can join") by default, except that anyone can register their own homeserver, which makes this limited. Server admins can block IP addresses and home servers, but those tools are not easily available to room admins. There is an API (m.room.server_acl in /devtools) but it is not reliable (thanks Austin Huang for the clarification). Matrix has the concept of guest accounts, but it is not used very much, and virtually no client or homeserver supports it. This contrasts with the way IRC works: by default, anyone can join an IRC network even without authentication. Some channels require registration, but in general you are free to join and look around (until you get blocked, of course). I have seen anecdotal evidence (CW: Twitter, nitter link) that "moderating bridges is hell", and I can imagine why. Moderation is already hard enough on one federation; when you bridge a room with another network, you inherit all the problems from that network, but without the full set of abuse control tools from the original network's API...

Room admins Matrix, in particular, has the problem that room administrators (which have the power to redact messages, ban users, and promote other users) are bound to their Matrix ID which is, in turn, bound to their home servers. This implies that a home server administrator could (1) impersonate a given user and (2) use that to hijack the room. So in practice, the home server is the trust anchor for rooms, not the user themselves. That said, if server B's administrator hijacks user joe on server B, they will hijack that room on that specific server. This will not (necessarily) affect users on the other servers, as servers could refuse parts of the updates or ban the compromised account (or server). It does seem like a major flaw that room credentials are bound to Matrix identifiers, as opposed to the E2E encryption credentials. In an encrypted room, even with fully verified members, a compromised or hostile home server can still take over the room by impersonating an admin. That admin (or even a newly minted user) can then send events or listen in on the conversations. This is even more frustrating when you consider that Matrix events are actually signed and therefore have some authentication attached to them, acting like some sort of Merkle tree (as each event contains a link to previous events). That signature, however, is made with the homeserver's PKI keys, not the client's E2E keys, which makes E2E feel like it has been "bolted on" later.

Availability While Matrix has a strong advantage over Signal in that it's decentralized (so anyone can run their own homeserver), I couldn't find an easy way to run a "multi-primary" setup, or even a "redundant" setup (even with a single primary backend), short of going full-on "replicate PostgreSQL and Redis data", which is typically not for the faint of heart.

How this works in IRC On IRC, it's quite easy to set up redundant nodes. All you need is:
  1. a new machine (with its own public address and an open port)
  2. a shared secret (or certificate) between that machine and an existing one on the network
  3. a connect block on both servers
That's it: the node will join the network and people can connect to it as usual and share the same user/namespace as the rest of the network. The servers take care of synchronizing state: you do not need to worry about replicating a database server. (Now, experienced IRC people will know there's a catch here: IRC doesn't have authentication built in, and relies on "services", which are basically bots that authenticate users (I'm simplifying, don't nitpick). If that service goes down, the network still works, but people can't authenticate, and others can start doing nasty things like stealing people's identities if they get knocked offline. But the basic functionality still works: you can talk in rooms and with users that are on the reachable network.)

User identities Matrix is more complicated. Each "home server" has its own identity namespace: a specific user (say @anarcat:matrix.org) is bound to that specific home server. If that server goes down, that user is completely disconnected. They could register a new account elsewhere and reconnect, but then they basically lose all their configuration: their contacts and joined channels are all lost. (Also notice how Matrix IDs don't look like a typical user address, the way an email-style address does in XMPP. They at least did their homework and got the allocation for the scheme.)

Rooms Users talk to each other in "rooms", even in one-to-one communications. (Rooms are also used for other things like "spaces"; they're basically used for everything, in an "everything is a file" kind of way.) For rooms, home servers act more like IRC nodes in that they keep a local state of the chat room and synchronize it with other servers. Users can keep talking inside a room if the server that originally hosted the room goes down. Rooms can have a local, server-specific "alias" so that, say, #room:matrix.org is also visible as #room:example.com on the example.com home server. Both addresses refer to the same underlying room. (Finding this in the Element settings is not obvious though, because those "aliases" are actually called "local addresses" there. So to create such an alias (in Element), you need to go to the room settings' "General" section, "Show more" under "Local address", then add the alias name (e.g. foo), and then that room will be available on your example.com homeserver as #foo:example.com.) So a room doesn't belong to a server, it belongs to the federation, and anyone can join the room from any server (if the room is public, or if invited otherwise). You can create a room on server A and when a user from server B joins, the room will be replicated on server B as well. If server A fails, server B will keep relaying traffic to connected users and servers. A room is therefore not fundamentally addressed with the above alias; instead, it has an internal Matrix ID, which is basically a random string. It has a server name attached to it, but that was done just to avoid collisions. That can get a little confusing. For example, the #fractal:gnome.org room is an alias on the gnome.org server, but the room ID is !hwiGbsdSTZIwSRfybq:matrix.org. That's because the room was created on matrix.org, but the preferred branding is gnome.org now. As an aside, rooms, by default, live forever, even after the last user quits. There's an admin API to delete rooms and a tombstone event to redirect to another one, but neither has a GUI yet. The latter is part of MSC1501 ("Room version upgrades") which allows a room admin to close a room, with a message and a pointer to another room.
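You can see the alias/ID distinction for yourself by resolving an alias through the client-server API; for the example above (no access token should be needed for this particular lookup, and the output is abbreviated):
# resolve a room alias to its underlying room ID
curl -s "https://matrix.org/_matrix/client/v3/directory/room/%23fractal%3Agnome.org"
# => {"room_id": "!hwiGbsdSTZIwSRfybq:matrix.org", "servers": [...]}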

Spaces Discovering rooms can be tricky: there is a per-server room directory, but Matrix.org people are trying to deprecate it in favor of "Spaces". Room directories were ripe for abuse: anyone can create a room, so anyone can show up in there. It's possible to restrict who can add aliases, but in any case directories were seen as too limited. In contrast, a "Space" is basically a room that's an index of other rooms (including other spaces), so existing moderation and administration mechanisms that work in rooms can (somewhat) work in spaces as well. This enables a room directory that works across the federation, regardless of which server the rooms were originally created on. New users can be added to a space or room automatically in Synapse. (Existing users can be told about the space with a server notice.) This gives admins a way to pre-populate a list of rooms on a server, which is useful to build clusters of related home servers, providing some sort of redundancy at the room -- not user -- level.
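The automatic join mentioned above is a Synapse configuration option; a sketch, with option names taken from the Synapse documentation and the rooms and file path made up:
# make newly registered users join a set of rooms or spaces
sudo tee -a /etc/matrix-synapse/homeserver.yaml >/dev/null <<'EOF'
auto_join_rooms:
  - "#welcome:example.com"
  - "#community-space:example.com"
autocreate_auto_join_rooms: true   # create the rooms if they don't exist yet
EOF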

Home servers So while you can work around a home server going down at the room level, there's no such thing at the home server level, for user identities. So if you want those identities to be stable in the long term, you need to think about high availability. One limitation is that the domain name (e.g. matrix.example.com) must never change in the future, as renaming home servers is not supported. The documentation used to say you could "run a hot spare" but that has been removed. Last I heard, it was not possible to run a high-availability setup where multiple, separate locations could replace each other automatically. You can have high performance setups where the load gets distributed among workers, but those are based on a shared database (Redis and PostgreSQL) backend. So my guess is it would be possible to create a "warm" spare of a Matrix home server with regular PostgreSQL replication, but that is not documented in the Synapse manual. This sort of setup would also not be useful to deal with networking issues or denial of service attacks, as you will not be able to spread the load over multiple network locations easily. Redis and PostgreSQL heroes are welcome to provide their multi-primary solution in the comments. In the meantime, I'll just point out that this is handled somewhat more gracefully in IRC, which has the possibility of delegating the authentication layer.

Delegations If you do not want to run a Matrix server yourself, it's possible to delegate the entire thing to another server. There's a server discovery API which uses the .well-known pattern (or SRV records, but that's "not recommended" and a bit confusing) to delegate that service to another server. Be warned that the server still needs to be explicitly configured for your domain. You can't just put:
  "m.server": "matrix.org:443"  
... on https://example.com/.well-known/matrix/server and start using @you:example.com as a Matrix ID. That's because Matrix doesn't support "virtual hosting" and you'd still be connecting to rooms and people with your matrix.org identity, not example.com as you would normally expect. This is also why you cannot rename your home server. The server discovery API is what allows servers to find each other. Clients, on the other hand, use the client-server discovery API: this is what allows a given client to find your home server when you type your Matrix ID on login.
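Both discovery endpoints are plain HTTPS documents, so you can inspect them with curl; for example (example.com stands in for your domain, and the responses shown are what a typical delegated setup returns):
# server-server discovery: which server handles Matrix federation for this domain?
curl -s https://example.com/.well-known/matrix/server
# => {"m.server": "matrix.example.com:443"}
# client-server discovery: where should clients connect for @you:example.com?
curl -s https://example.com/.well-known/matrix/client
# => {"m.homeserver": {"base_url": "https://matrix.example.com"}}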

Performance The high availability discussion brushed over the performance of Matrix itself, but let's now dig into that.

Horizontal scalability There were serious scalability issues with the main Matrix server, Synapse, in the past, so the Matrix team has been working hard to improve its design. Since Synapse 1.22, the home server can scale horizontally across multiple workers (see this blog post for details), which makes it easier to run large servers.

Other implementations There are other promising home server implementations from a performance standpoint (dendrite, Golang, entered beta in late 2020; conduit, Rust, beta; others), but none of those are feature-complete, so there's a trade-off to be made there. Synapse is also adding a lot of features fast, so it's an open question whether the others will ever catch up. (I have heard that Dendrite might actually surpass Synapse in features within a few years, which would put Synapse in a more "LTS" situation.)

Latency Matrix can feel slow sometimes. For example, joining the "Matrix HQ" room in Element (from matrix.debian.social) takes a few minutes and then fails. That is because the home server has to sync the entire room state when you join the room. There was promising work on this announced in the lengthy 2021 retrospective, and some of that work landed (partial sync) in the 1.53 release already. Other improvements coming include sliding sync, lazy loading over federation, and fast room joins. So that's actually something that could be fixed in the fairly short term. But in general, communication in Matrix doesn't feel as "snappy" as on IRC or even Signal. It's hard to quantify this without instrumenting a full latency test bed (for example the tools I used in the terminal emulators latency tests), but even just typing in a web browser feels slower than typing in an xterm or Emacs for me. Even in conversations, I "feel" people don't immediately respond as fast. In fact, this could be an interesting double-blind experiment to run: have people guess whether they are talking to a person on Matrix, XMPP, or IRC, for example. My theory would be that people could notice that Matrix users are slower, if only because of the TCP round-trip time each message has to take.

Transport Some courageous person actually made some tests of various messaging platforms on a congested network. His evaluation was basically:
  • Briar: uses Tor, so unusable except locally
  • Matrix: "struggled to send and receive messages", joining a room takes forever as it has to sync all history, "took 20-30 seconds for my messages to be sent and another 20 seconds for further responses"
  • XMPP: "worked in real-time, full encryption, with nearly zero lag"
So that was interesting. I suspect IRC would have also fared better, but that's just a feeling. Other improvements to the transport layer include support for websocket and the CoAP proxy work from 2019 (targeting 100bps links), but both seem stalled at the time of writing. The Matrix people have also announced the pinecone p2p overlay network which aims at solving large, internet-scale routing problems. See also this talk at FOSDEM 2022.

Usability

Onboarding and workflow The workflow for joining a room, when you use Element web, is not great:
  1. click on a link in a web browser
  2. land on (say) https://matrix.to/#/#matrix-dev:matrix.org
  3. offers "Element", yeah that's sounds great, let's click "Continue"
  4. land on https://app.element.io/#/room%2F%23matrix-dev%3Amatrix.org and then you need to register, aaargh
As you might have guessed by now, there is a specification to solve this, but web browsers need to adopt it as well, so that's far from actually being solved. At least browsers generally know about the matrix: scheme; it's just not exactly clear what they should do with it, especially when the handler is just another web page (e.g. Element web). In general, when compared with tools like Signal or WhatsApp, Matrix doesn't fare so well in terms of user discovery. I probably have some of my normal contacts that have a Matrix account as well, but there's really no way to know. It's kind of creepy when Signal tells you "this person is on Signal!" but it's also pretty cool that it works, and they actually implemented it pretty well. Registration is also less obvious: in Signal, the app confirms your phone number automatically. It's frictionless and quick. In Matrix, you need to learn about home servers, pick one, register (with a password! aargh!), and then set up encryption keys (not enabled by default), etc. It's a lot more friction. And look, I understand: giving away your phone number is a huge trade-off. I don't like it either. But it solves a real problem and makes encryption accessible to a ton more people. Matrix does have "identity servers" that can serve that purpose, but I don't feel confident sharing my phone number there. It doesn't help that the identity servers don't have private contact discovery: giving them your phone number is a more serious security compromise than with Signal. There's a catch-22 here too: because no one feels like giving away their phone numbers, no one does, and everyone assumes that stuff doesn't work anyways. Like it or not, Signal forcing people to divulge their phone number actually gives them a critical mass: it means a lot of my relatives are actually on Signal and I don't have to install crap like WhatsApp to talk with them.

5 minute clients evaluation Throughout all my tests I evaluated a handful of Matrix clients, mostly from Flathub because almost none of them are packaged in Debian. Right now I'm using Element, the flagship client from Matrix.org, in a web browser window, with the PopUp Window extension. This makes it look almost like a native app, and opens links in my main browser window (instead of a new tab in that separate window), which is nice. But I'm tired of buying memory to feed my web browser, so this indirection has to stop. Furthermore, I'm often getting completely logged off from Element, which means re-logging in, recovering my security keys, and reconfiguring my settings. That is extremely annoying. Coming from Irssi, Element is really "GUI-y" (pronounced "gooey"). Lots of clickety happening. To mark conversations as read, in particular, I need to click-click-click on all the tabs that have some activity. There's no "jump to latest message" or "mark all as read" functionality as far as I could tell. In Irssi the former is built-in (alt-a) and I made a custom /READ command for the latter:
/ALIAS READ script exec \$_->activity(0) for Irssi::windows
And yes, that's a Perl script in my IRC client. I am not aware of any Matrix client that does stuff like that, except maybe Weechat, if we can call it a Matrix client, or Irssi itself, now that it has a Matrix plugin (!). As for other clients, I have looked through the Matrix Client Matrix (confusing right?) to try to figure out which one to try, and, even after selecting Linux as a filter, the chart is just too wide to figure out anything. So I tried those, kind of randomly:
  • Fractal
  • Mirage
  • Nheko
  • Quaternion
Unfortunately, I lost my notes on those; I don't actually remember which one did what. I still have a session open with Mirage, so I guess that means it's the one I preferred, but I remember they were also all very GUI-y. Maybe I need to look at weechat-matrix or gomuks. At least Weechat is scriptable so I could continue playing the power-user. Right now my strategy with messaging (and that includes microblogging like Twitter or Mastodon) is that everything goes through my IRC client, so Weechat could actually fit well in there. Going with gomuks, on the other hand, would mean running it in parallel with Irssi or ... ditching IRC, which is a leap I'm not quite ready to take just yet. Oh, and basically none of those clients (except Nheko and Element) support VoIP, which is still kind of a second-class citizen in Matrix. Matrix does not support large multimedia rooms, for example: Jitsi was used for FOSDEM instead of the native videoconferencing system.

Bots This falls a little outside the "usability" section, but I didn't know where else to put it... There are a few Matrix bots out there, and you are likely going to be able to replace your existing bots with Matrix bots. It's true that IRC has a long and impressive history with lots of various bots doing various things, but given how young Matrix is, there's still a good variety:
  • maubot: generic bot with tons of usual plugins like sed, dice, karma, xkcd, echo, rss, reminder, translate, react, exec, gitlab/github webhook receivers, weather, etc
  • opsdroid: framework to implement "chat ops" in Matrix, connects with Matrix, GitHub, GitLab, Shell commands, Slack, etc
  • matrix-nio: another framework, used to build lots more bots like:
    • hemppa: generic bot with various functionality like weather, RSS feeds, calendars, cron jobs, OpenStreetmaps lookups, URL title snarfing, wolfram alpha, astronomy pic of the day, Mastodon bridge, room bridging, oh dear
    • devops: ping, curl, etc
    • podbot: play podcast episodes from AntennaPod
    • cody: Python, Ruby, Javascript REPL
    • eno: generic bot, "personal assistant"
  • mjolnir: moderation bot
  • hookshot: bridge with GitLab/GitHub
  • matrix-monitor-bot: latency monitor
One thing I haven't found an equivalent for is Debian's MeetBot. There's an archive bot but it doesn't have topics or a meeting chair, or HTML logs.

Working on Matrix As a developer, I find Matrix kind of intimidating. The specification is huge. The official specification itself looks somewhat digestible: it's only 6 APIs so that looks, at first, kind of reasonable. But whenever you start asking complicated questions about Matrix, you quickly fall into the Matrix Spec Change specification (which, yes, is a separate specification). And there are literally hundreds of MSCs flying around. It's hard to tell what's been adopted and what hasn't, and even harder to figure out if your specific client has implemented it. (One trendy answer to this problem is to "rewrite it in rust": the Matrix people are working on implementing a lot of those specifications in a matrix-rust-sdk that's designed to take the implementation details away from users.) Just taking the latest weekly Matrix report, you find that three new MSCs were proposed just last week! There's even a graph that shows the number of MSCs is progressing steadily, at 600+ proposals total, with the majority (300+) "new". I would guess the "merged" ones are at about 150. That's a lot of text which includes stuff like 3D worlds which, frankly, I don't think you should be working on when you have such important security and usability problems. (The internet as a whole, arguably, doesn't fare much better. RFC600 is a really obscure discussion about "INTERFACING AN ILLINOIS PLASMA TERMINAL TO THE ARPANET". Maybe that's how many MSCs will end up as well, left forgotten in the pits of history.) And that's the thing: maybe the Matrix people have a different objective than I have. They want to connect everything to everything, and make Matrix a generic transport for all sorts of applications, including virtual reality, collaborative editors, and so on. I just want secure, simple messaging. Possibly with good file transfers, and video calls. That it works with existing stuff is good, and it should be federated to remove the "Signal point of failure". So I'm a bit worried about the direction all those MSCs are taking, especially when you consider that clients other than Element are still struggling to keep up with basic features like end-to-end encryption or room discovery, never mind voice or spaces...

Conclusion Overall, Matrix is somewhat in the space where XMPP was a few years ago. It has a ton of features, pretty good clients, and a large community. It seems to have gained some of the momentum that XMPP has lost. It may have the most potential to replace Signal if something bad were to happen to it (like, I don't know, getting banned or going nuts with cryptocurrency)... But it's really not there yet, and I don't see Matrix trying to get there either, which is a bit worrisome.

Looking back at history I'm also worried that we are repeating the errors of the past. The history of federated services is really fascinating. IRC, FTP, HTTP, and SMTP were all created in the early days of the internet, and are all still around (except, arguably, FTP, which was removed from major browsers recently). All of them had to face serious challenges in growing their federation. IRC had numerous conflicts and forks, at both the technical and the political level. The history of IRC is really something that anyone working on a federated system should study in detail, because they are bound to make the same mistakes if they are not familiar with it. The "short" version is:
  • 1988: Finnish researcher publishes first IRC source code
  • 1989: 40 servers worldwide, mostly universities
  • 1990: EFnet ("Eris-free network") fork, which blocks the "open relay" server named Eris; followers of Eris form A-net, which promptly dissolves itself, leaving only EFnet
  • 1992: Undernet fork, which offered authentication ("services"), routing improvements and timestamp-based channel synchronisation
  • 1994: DALnet fork, from Undernet, again on a technical disagreement
  • 1995: Freenode founded
  • 1996: IRCnet forks from EFnet, following a flame war of historical proportion, splitting the network between Europe and the Americas
  • 1997: Quakenet founded
  • 1999: (XMPP founded)
  • 2001: 6 million users, OFTC founded
  • 2002: DALnet peaks at 136,000 users
  • 2003: IRC as a whole peaks at 10 million users, EFnet peaks at 141,000 users
  • 2004: (Facebook founded), Undernet peaks at 159,000 users
  • 2005: Quakenet peaks at 242,000 users, IRCnet peaks at 136,000 (Youtube founded)
  • 2006: (Twitter founded)
  • 2009: (WhatsApp, Pinterest founded)
  • 2010: (TextSecure AKA Signal, Instagram founded)
  • 2011: (Snapchat founded)
  • ~2013: Freenode peaks at ~100,000 users
  • 2016: IRCv3 standardisation effort started (TikTok founded)
  • 2021: Freenode self-destructs, Libera chat founded
  • 2022: Libera peaks at 50,000 users, OFTC peaks at 30,000 users
(The numbers were taken from the Wikipedia page and Netsplit.de. Note that I also include the launch of other networks in parentheses for context.) Pretty dramatic, don't you think? Eventually, somehow, IRC became irrelevant for most people: few people are even aware of it now. With fewer than a million active users, it's smaller than Mastodon, XMPP, or Matrix at this point.1 If I were to venture a guess, I'd say that infighting, lack of a standardization body, and a somewhat annoying protocol meant the network could not grow. It's also possible that the decentralised yet centralised structure of IRC networks limited their reliability and growth. But large social media companies have also taken over the space: observe how IRC numbers peaked around the time the wave of large social media companies emerged, especially Facebook (2.9B users!!) and Twitter (400M users).

Where the federated services are in history Right now, Matrix and Mastodon (and email!) are at the "pre-EFnet" stage: anyone can join the federation. Mastodon has started working on a global block list of fascist servers, which is interesting, but it's still an open federation. Right now, Matrix is totally open, but matrix.org publishes a (federated) block list of hostile servers (#matrix-org-coc-bl:matrix.org, yes, of course it's a room). Interestingly, email is also in that stage, where there are block lists of spammers, and it's a race between those blockers and spammers. Large email providers, obviously, are getting closer to the EFnet stage: you could consider they only accept email from themselves or between themselves. It's getting increasingly hard to deliver mail to Outlook and Gmail for example, partly because of bias against small providers, but also because they are including more and more machine-learning tools to sort through email and those systems are, fundamentally, unknowable. It's not quite the same as splitting the federation the way EFnet did, but the effect is similar. HTTP has somehow managed to live in a parallel universe, as it's technically still completely federated: anyone can start a web server if they have a public IP address and anyone can connect to it. The catch, of course, is how you find the darn thing. Which is how Google became one of the most powerful corporations on earth, and how they became the gatekeepers of human knowledge online. I have only briefly mentioned XMPP here, and my XMPP fans will undoubtedly comment on that, but I think it's somewhere in the middle of all of this. It was co-opted by Facebook and Google, and both corporations have abandoned it to its fate. I remember fondly the days when I could do instant messaging with my contacts who had a Gmail account. Those days are gone, and I don't talk to anyone over Jabber anymore, unfortunately. And this is a threat that Matrix still has to face. It's also the threat email is currently facing. On the one hand, corporations like Facebook want to completely destroy it and have mostly succeeded: many people just have an email account to register on things and talk to their friends over Instagram or (lately) TikTok (which, I know, is not Facebook, but they started that fire). On the other hand, you have corporations like Microsoft and Google who are still using and providing email services because, frankly, you still do need email for stuff, just like fax is still around; but they are more and more isolated in their own silos. At this point, it's only a matter of time before they reach critical mass and just decide that the risk of allowing external mail in is not worth the cost. They'll simply flip the switch and work on an allow-list principle. Then we'll have closed the loop and email will be dead, just like IRC is "dead" now. I wonder which path Matrix will take. Could it liberate us from these vicious cycles? Update: this generated some discussions on lobste.rs.

  1. According to Wikipedia, there are currently about 500 distinct IRC networks operating, on about 1,000 servers, serving over 250,000 users. In contrast, Mastodon seems to be around 5 million users, Matrix.org claimed at FOSDEM 2021 to have about 28 million globally visible accounts, and Signal lays claim to over 40 million souls. XMPP claims to have "millions" of users on the xmpp.org homepage but the FAQ says they don't actually know. On the proprietary silo side of the fence, this page says
    • Facebook: 2.9 billion users
    • WhatsApp: 2B
    • Instagram: 1.4B
    • TikTok: 1B
    • Snapchat: 500M
    • Pinterest: 480M
    • Twitter: 397M
    Notable omission from that list: Youtube, with its mind-boggling 2.6 billion users... Those are not the kind of numbers you just "need to convince a brother or sister" to grow the network...

12 June 2022

Russ Allbery: Review: The Shattered Sphere

Review: The Shattered Sphere, by Roger MacBride Allen
Series: Hunted Earth #2
Publisher: Tor
Copyright: July 1994
Printing: September 1995
ISBN: 0-8125-3016-0
Format: Mass market
Pages: 491
The Shattered Sphere is a direct sequel to The Ring of Charon and spoils everything about the plot of the first book. You don't want to start here. Also be aware that essentially everything you can read about this book will spoil the major plot driver of The Ring of Charon in the first sentence. I'm going to review the book without doing that, but it's unlikely anyone else will try. The end of the previous book stabilized matters, but in no way resolved the plot. The Shattered Sphere opens five years later. Most of the characters from the first novel are joined by some new additions, and all of them are trying to make sense of a drastically changed and far more dangerous understanding of the universe. Humanity has a new enemy, one that's largely unaware of humanity's existence and able to operate on a scale that dwarfs human endeavors. The good news is that humans aren't being actively attacked. The bad news is that they may be little more than raw resources, stashed in a safe spot for future use. That is reason enough to worry. Worse are the hints of a far greater danger, one that may be capable of destruction on a scale nearly beyond human comprehension. Humanity may be trapped between a sophisticated enemy to whom human activity is barely more noticeable than ants, and a mysterious power that sends that enemy into an anxious panic. This series is an easily-recognized example of an in-between style of science fiction. It shares the conceptual bones of an earlier era of short engineer-with-a-wrench stories that are full of set pieces and giant constructs, but Allen attempts to add the characterization that those books lacked. But the technique isn't there; he's trying, and the basics of characterization are present, but with none of the emotional and descriptive sophistication of more recent SF. The result isn't bad, exactly, but it's bloated and belabored. Most of the characterization comes through repetition and ham-handed attempts at inner dialogue. Slow plotting doesn't help. Allen spends half of a nearly 500 page novel on setup in two primary threads. One is mostly people explaining detailed scientific theories to each other, mixed with an attempt at creating reader empathy that's more forceful than effective. The other is a sort of big dumb object exploration that failed to hold my attention and that turned out to be mostly irrelevant. Key revelations from that thread are revealed less by the actions of the characters than by dumping them on the reader in an extended monologue. The reading goes quickly, but only because the writing is predictable and light on interesting information, not because the plot is pulling the reader through the book. I found myself wishing for an earlier era that would have cut about 300 pages out of this book without losing any of the major events. Once things finally start happening, the book improves considerably. I grew up reading large-scale scientific puzzle stories, and I still have a soft spot for a last-minute scientific fix and dramatic set piece even if the descriptive detail leaves something to be desired. The last fifty pages are fast-moving and satisfying, only marred by their failure to convince me that the humans were required for the plot. The process of understanding alien technology well enough to use it the right way kept me entertained, but I don't understand why the aliens didn't use it themselves. I think this book falls between two stools. 
The scientific mysteries and set pieces would have filled a tight, fast-moving 200 page book with a minimum of characterization. It would have been a throwback to an earlier era of science fiction, but not a bad one. Allen instead wanted to provide a large cast of sympathetic and complex characters, and while I appreciate the continued lack of villains, the writing quality is not sufficient to the task. This isn't an awful book, but the quality bar in the genre is so much higher now. There are better investments of your reading time available today. Like The Ring of Charon, The Shattered Sphere reaches a satisfying conclusion but does not resolve the series plot. No sequel has been published, and at this point one seems unlikely to materialize. Rating: 5 out of 10

16 May 2022

Matthew Garrett: Can we fix bearer tokens?

Last month I wrote about how bearer tokens are just awful, and a week later Github announced that someone had managed to exfiltrate bearer tokens from Heroku that gave them access to, well, a lot of Github repositories. This has inevitably resulted in a whole bunch of discussion about a number of things, but people seem to be largely ignoring the fundamental issue that maybe we just shouldn't have magical blobs that grant you access to basically everything even if you've copied them from a legitimate holder to Honest John's Totally Legitimate API Consumer.

To make it clearer what the problem is here, let's use an analogy. You have a safety deposit box. To gain access to it, you simply need to be able to open it with a key you were given. Anyone who turns up with the key can open the box and do whatever they want with the contents. Unfortunately, the key is extremely easy to copy - anyone who is able to get hold of your keyring for a moment is in a position to duplicate it, and then they have access to the box. Wouldn't it be better if something could be done to ensure that whoever showed up with a working key was someone who was actually authorised to have that key?

To achieve that we need some way to verify the identity of the person holding the key. In the physical world we have a range of ways to achieve this, from simply checking whether someone has a piece of ID that associates them with the safety deposit box all the way up to invasive biometric measurements that supposedly verify that they're definitely the same person. But computers don't have passports or fingerprints, so we need another way to identify them.

When you open a browser and try to connect to your bank, the bank's website provides a TLS certificate that lets your browser know that you're talking to your bank instead of someone pretending to be your bank. The spec allows this to be a bi-directional transaction - you can also prove your identity to the remote website. This is referred to as "mutual TLS", or mTLS, and a successful mTLS transaction ends up with both ends knowing who they're talking to, as long as they have a reason to trust the certificate they were presented with.

That's actually a pretty big constraint! We have a reasonable model for the server - it's something that's issued by a trusted third party and it's tied to the DNS name for the server in question. Clients don't tend to have stable DNS identity, and that makes the entire thing sort of awkward. But, thankfully, maybe we don't need to? We don't need the client to be able to prove its identity to arbitrary third party sites here - we just need the client to be able to prove it's a legitimate holder of whichever bearer token it's presenting to that site. And that's a much easier problem.

Here's the simple solution - clients generate a TLS cert. This can be self-signed, because all we want to do here is be able to verify whether the machine talking to us is the same one that had a token issued to it. The client contacts a service that's going to give it a bearer token. The service requests mTLS auth without being picky about the certificate that's presented. The service embeds a hash of that certificate in the token before handing it back to the client. Whenever the client presents that token to any other service, the service ensures that the mTLS cert the client presented matches the hash in the bearer token. Copy the token without copying the mTLS certificate and the token gets rejected. Hurrah hurrah hats for everyone.
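To make that concrete, here is a rough sketch of the client side with openssl. The certificate subject and file names are arbitrary, and the value computed at the end is the kind of thumbprint (x5t#S256, as in RFC 8705) a token issuer could embed in the token:
# generate a self-signed client certificate with an EC key
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-256 \
  -keyout client.key -out client.crt -nodes -days 365 -subj "/CN=ci-runner"
# compute the SHA-256 thumbprint of the DER-encoded certificate (base64url, no padding)
openssl x509 -in client.crt -outform DER | openssl dgst -sha256 -binary \
  | base64 | tr '+/' '-_' | tr -d '='
# present the certificate on every request; the service checks its hash against the token
curl --cert client.crt --key client.key -H "Authorization: Bearer $TOKEN" https://api.example.com/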

Well except for the obvious problem that if you're in a position to exfiltrate the bearer tokens you can probably just steal the client certificates and keys as well, and now you can pretend to be the original client and this is not adding much additional security. Fortunately pretty much everything we care about has the ability to store the private half of an asymmetric key in hardware (TPMs on Linux and Windows systems, the Secure Enclave on Macs and iPhones, either a piece of magical hardware or Trustzone on Android) in a way that avoids anyone being able to just steal the key.

How do we know that the key is actually in hardware? Here's the fun bit - it doesn't matter. If you're issuing a bearer token to a system then you're already asserting that the system is trusted. If the system is lying to you about whether or not the key it's presenting is hardware-backed then you've already lost. If it lied and the system is later compromised then sure all your apes get stolen, but maybe don't run systems that lie and avoid that situation as a result?

Anyway. This is covered in RFC 8705 so why aren't we all doing this already? From the client side, the largest generic issue is that TPMs are astonishingly slow in comparison to doing a TLS handshake on the CPU. RSA signing operations on TPMs can take around half a second, which doesn't sound too bad, except your browser is probably establishing multiple TLS connections to subdomains on the site it's connecting to and performance is going to tank. Fixing this involves doing whatever's necessary to convince the browser to pipe everything over a single TLS connection, and that's just not really where the web is right at the moment. Using EC keys instead helps a lot (~0.1 seconds per signature on modern TPMs), but it's still going to be a bottleneck.

The other problem, of course, is that ecosystem support for hardware-backed certificates is just awful. Windows lets you stick them into the standard platform certificate store, but the docs for this are hidden in a random PDF in a Github repo. Macs require you to do some weird bridging between the Secure Enclave API and the keychain API. Linux? Well, the standard answer is to do PKCS#11, and I have literally never met anybody who likes PKCS#11 and I have spent a bunch of time in standards meetings with the sort of people you might expect to like PKCS#11 and even they don't like it. It turns out that loading a bunch of random C bullshit that has strong feelings about function pointers into your security critical process is not necessarily something that is going to improve your quality of life, so instead you should use something like this and just have enough C to bridge to a language that isn't secretly plotting to kill your pets the moment you turn your back.

And, uh, obviously none of this matters at all unless people actually support it. Github has no support at all for validating the identity of whoever holds a bearer token. Most issuers of bearer tokens have no support for embedding holder identity into the token. This is not good! As of last week, all three of the big cloud providers support virtualised TPMs in their VMs - we should be running CI on systems that can do that, and tying any issued tokens to the VMs that are supposed to be making use of them.

So sure this isn't trivial. But it's also not impossible, and making this stuff work would improve the security of, well, everything. We literally have the technology to prevent attacks like Github suffered. What do we have to do to get people to actually start working on implementing that?


29 April 2022

Russ Allbery: Review: Interesting Times

Review: Interesting Times, by Terry Pratchett
Series: Discworld #17
Publisher: Harper
Copyright: 1994
Printing: February 2014
ISBN: 0-06-227629-8
Format: Mass market
Pages: 399
Interesting Times is the seventeenth Discworld novel and certainly not the place to start. At the least, you will probably want to read The Colour of Magic and The Light Fantastic before this book, since it's a sequel to those (although Rincewind has had some intervening adventures). Lord Vetinari has received a message from the Counterweight Continent, the first in ten years, cryptically demanding the Great Wizzard be sent immediately. The Agatean Empire is one of the most powerful states on the Disc. Thankfully for everyone else, it normally suits its rulers to believe that the lands outside their walls are inhabited only by ghosts. No one is inclined to try to change their minds or otherwise draw their attention. Accordingly, the Great Wizard must be sent, a task that Vetinari efficiently delegates to the Archchancellor. There is only the small matter of determining who the Great Wizzard is, and why it was spelled with two z's. Discworld readers with a better memory than I will recall Rincewind's hat. Why the Counterweight Continent would demanding a wizard notorious for his near-total inability to perform magic is a puzzle for other people. Rincewind is promptly located by a magical computer, and nearly as promptly transported across the Disc, swapping him for an unnecessarily exciting object of roughly equivalent mass and hurling him into an unexpected rescue of Cohen the Barbarian. Rincewind predictably reacts by running away, although not fast or far enough to keep him from being entangled in a glorious popular uprising. Or, well, something that has aspirations of being glorious, and popular, and an uprising. I hate to say this, because Pratchett is an ethically thoughtful writer to whom I am willing to give the benefit of many doubts, but this book was kind of racist. The Agatean Empire is modeled after China, and the Rincewind books tend to be the broadest and most obvious parodies, so that was already a recipe for some trouble. Some of the social parody is not too objectionable, albeit not my thing. I find ethnic stereotypes and making fun of funny-sounding names in other languages (like a city named Hunghung) to be in poor taste, but Pratchett makes fun of everyone's names and cultures rather equally. (Also, I admit that some of the water buffalo jokes, despite the stereotypes, were pretty good.) If it had stopped there, it would have prompted some eye-rolling but not much comment. Unfortunately, a significant portion of the plot depends on the idea that the population of the Agatean Empire has been so brainwashed into obedience that they have a hard time even imagining resistance, and even their revolutionaries are so polite that the best they can manage for slogans are things like "Timely Demise to All Enemies!" What they need are a bunch of outsiders, such as Rincewind or Cohen and his gang. More details would be spoilers, but there are several deliberate uses of Ankh-Morpork as a revolutionary inspiration and a great deal of narrative hand-wringing over how awful it is to so completely convince people they are slaves that you don't need chains. There is a depressingly tedious tendency of western writers, even otherwise thoughtful and well-meaning ones like Pratchett, to adopt a simplistic ranking of political systems on a crude measure of freedom. That analysis immediately encounters the problem that lots of people who live within systems that rate poorly on this one-dimensional scale seem inadequately upset about circumstances that are "obviously" horrific oppression. 
This should raise questions about the validity of the assumptions, but those assumptions are so unquestionable that the writer instead decides the people who are insufficiently upset about their lack of freedom must be defective. The more racist writers attribute that defectiveness to racial characteristics. The less racist writers, like Pratchett, attribute that defectiveness to brainwashing and systemic evil, which is not quite as bad as overt racism but still rests on a foundation of smug cultural superiority. Krister Stendahl, a bishop of the Church of Sweden, coined three famous rules for understanding other religions:
  1. When you are trying to understand another religion, you should ask the adherents of that religion and not its enemies.
  2. Don't compare your best to their worst.
  3. Leave room for "holy envy."
This is excellent advice that should also be applied to politics. Most systems exist for some reason. The differences from your preferred system are easy to see, particularly those that strike you as horrible. But often there are countervailing advantages that are less obvious, and those are more psychologically difficult to understand and objectively analyze. You might find they have something that you wish your system had, which causes discomfort if you're convinced you have the best political system in the world, or are making yourself feel better about the abuses of your local politics by assuring yourself that at least you're better than those people. I was particularly irritated to see this sort of simplistic stereotyping in Discworld given that Ankh-Morpork, the setting of most of the Discworld novels, is an authoritarian dictatorship. Vetinari quite capably maintains his hold on power, and yet this is not taken as a sign that the city's inhabitants have been brainwashed into considering themselves slaves. Instead, he's shown as adept at maintaining the stability of a precarious system with a lot of competing forces and a high potential for destructive chaos. Vetinari is an awful person, but he may be better than anyone who would replace him. Hmm. This sort of complexity is permitted in the "local" city, but as soon as we end up in an analog of China, the rulers are evil, the system lacks any justification, and the peasants only don't revolt because they've been trained to believe they can't. Gah. I was muttering about this all the way through Interesting Times, which is a shame because, outside of the ham-handed political plot, it has some great Pratchett moments. Rincewind's approach to any and all danger is a running (sorry) gag that keeps working, and Cohen and his gang of absurdly competent decrepit barbarians are both funnier here than they have been in any previous book and the rare highly-positive portrayal of old people in fantasy adventures who are not wizards or crones. Pretty Butterfly is a great character who deserved to be in a better plot. And I loved the trouble that Rincewind had with the Agatean tonal language, which is an excuse for Pratchett to write dialog full of frustrated non-sequiturs when Rincewind mispronounces a word. I do have to grumble about the Luggage, though. From a world-building perspective its subplot makes sense, but the Luggage was always the best character in the Rincewind stories, and the way it lost all of its specialness here was oddly sad and depressing. Pratchett also failed to convince me of the drastic retcon of The Colour of Magic and The Light Fantastic that he does here (and which I can't talk about in detail due to spoilers), in part because it's entangled in the orientalism of the plot. I'm not sure Pratchett could write a bad book, and I still enjoyed reading Interesting Times, but I don't think he gave the politics his normal care, attention, and thoughtful humanism. I hope later books in this part of the Disc add more nuance, and are less confident and judgmental. I can't really recommend this one, even though it has some merits. Also, just for the record, "may you live in interesting times" is not a Chinese curse. It's an English saying that likely was attributed to China to make it sound exotic, which is the sort of landmine that good-natured parody of other people's cultures needs to be wary of. Followed in publication order by Maskerade, and in Rincewind's personal timeline by The Last Continent. Rating: 6 out of 10

27 April 2022

Antoine Beaupr : building Debian packages under qemu with sbuild

I've been using sbuild for a while to build my Debian packages, mainly because it's what is used by the Debian autobuilders, but also because it's pretty powerful and efficient. Configuring it just right, however, can be a challenge. In my quick Debian development guide, I had a few pointers on how to configure sbuild with the normal schroot setup, but today I finished a qemu based configuration.

Why I want to use qemu mainly because it provides better isolation than a chroot. I sponsor packages sometimes and while I typically audit the source code before building, it still feels like the extra protection shouldn't hurt. I also like the idea of unifying my existing virtual machine setup with my build setup. My current VM setup is kind of all over the place: libvirt, vagrant, GNOME Boxes, etc. I've been slowly converging on libvirt, however, and most solutions I use right now rely on qemu under the hood, certainly not chroots... I could also have decided to go with containers like LXC, LXD, Docker (with conbuilder, whalebuilder, docker-buildpackage), systemd-nspawn (with debspawn), unshare (with schroot --chroot-mode=unshare), or whatever: I didn't feel those offer the level of isolation that is provided by qemu. The main downside of this approach is that it is (obviously) slower than native builds. But on modern hardware, that cost should be minimal.

How Basically, you need this:
sudo mkdir -p /srv/sbuild/qemu/
sudo apt install sbuild-qemu
sudo sbuild-qemu-create -o /srv/sbuild/qemu/unstable.img unstable https://deb.debian.org/debian
Then to make this used by default, add this to ~/.sbuildrc:
# run autopkgtest inside the schroot
$run_autopkgtest = 1;
# tell sbuild to use autopkgtest as a chroot
$chroot_mode = 'autopkgtest';
# tell autopkgtest to use qemu
$autopkgtest_virt_server = 'qemu';
# tell autopkgtest-virt-qemu the path to the image
# use --debug there to show what autopkgtest is doing
$autopkgtest_virt_server_options = [ '--', '/srv/sbuild/qemu/%r-%a.img' ];
# tell plain autopkgtest to use qemu, and the right image
$autopkgtest_opts = [ '--', 'qemu', '/srv/sbuild/qemu/%r-%a.img' ];
# no need to cleanup the chroot after build, we run in a completely clean VM
$purge_build_deps = 'never';
# no need for sudo
$autopkgtest_root_args = '';
Note that the above will use the default autopkgtest (1GB, one core) and qemu (128MB, one core) configuration, which might be a little low on resources. You probably want to be explicit about this, with something like this:
# extra parameters to pass to qemu
# --enable-kvm is not necessary, detected on the fly by autopkgtest
my @_qemu_options = ('--ram-size=4096', '--cpus=2');
# tell autopkgtest-virt-qemu the path to the image
# use --debug there to show what autopkgtest is doing
$autopkgtest_virt_server_options = [ @_qemu_options, '--', '/srv/sbuild/qemu/%r-%a.img' ];
$autopkgtest_opts = [ '--', 'qemu', @_qemu_options, '/srv/sbuild/qemu/%r-%a.img' ];
This configuration will:
  1. create a virtual machine image in /srv/sbuild/qemu for unstable
  2. tell sbuild to use that image to create a temporary VM to build the packages
  3. tell sbuild to run autopkgtest (which should really be default)
  4. tell autopkgtest to use qemu for builds and for tests
Note that the VM created by sbuild-qemu-create has an unlocked root account with an empty password.
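Once that configuration is in place, building works like any other sbuild invocation; for example (the package filename is just a placeholder):
# build a source package in a throwaway qemu VM, then run autopkgtest in it
sbuild --dist=unstable hello_2.10-2.dsc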

Other useful tasks
  • enter the VM for testing; changes will be discarded (thanks Nick Brown for the sbuild-qemu-boot tip!):
     sbuild-qemu-boot /srv/sbuild/qemu/unstable-amd64.img
    
    That program is shipped only with bookworm and later, an equivalent command is:
     qemu-system-x86_64 -snapshot -enable-kvm -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -m 2048 -nographic /srv/sbuild/qemu/unstable-amd64.img
    
    The key argument here is -snapshot.
  • enter the VM to make permanent changes, which will not be discarded:
     sudo sbuild-qemu-boot --readwrite /srv/sbuild/qemu/unstable-amd64.img
    
    Equivalent command:
     sudo qemu-system-x86_64 -enable-kvm -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -m 2048 -nographic /srv/sbuild/qemu/unstable-amd64.img
    
  • update the VM (thanks lavamind):
     sudo sbuild-qemu-update /srv/sbuild/qemu/unstable-amd64.img
    
  • build in a specific VM regardless of the suite specified in the changelog (e.g. UNRELEASED, bookworm-backports, bookworm-security, etc):
     sbuild --autopkgtest-virt-server-opts="-- qemu /var/lib/sbuild/qemu/bookworm-amd64.img"
    
    Note that you'd also need to pass --autopkgtest-opts if you want autopkgtest to run in the correct VM as well:
     sbuild --autopkgtest-opts="-- qemu /var/lib/sbuild/qemu/unstable.img" --autopkgtest-virt-server-opts="-- qemu /var/lib/sbuild/qemu/bookworm-amd64.img"
    
    You might also need parameters like --ram-size if you customized it above.
And yes, this is all quite complicated and could be streamlined a little, but that's what you get when you have years of legacy and just want to get stuff done. It seems to me autopkgtest-virt-qemu should have a magic flag that starts a shell for you, but it doesn't look like that's a thing. When that program starts, it just says ok and sits there. Maybe because the authors consider the above to be simple enough (see also bug #911977 for a discussion of this problem).

Live access to a running test When autopkgtest starts a VM, it uses this funky qemu commandline:
qemu-system-x86_64 -m 4096 -smp 2 -nographic -net nic,model=virtio -net user,hostfwd=tcp:127.0.0.1:10022-:22 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,id=rng-device0 -monitor unix:/tmp/autopkgtest-qemu.w1mlh54b/monitor,server,nowait -serial unix:/tmp/autopkgtest-qemu.w1mlh54b/ttyS0,server,nowait -serial unix:/tmp/autopkgtest-qemu.w1mlh54b/ttyS1,server,nowait -virtfs local,id=autopkgtest,path=/tmp/autopkgtest-qemu.w1mlh54b/shared,security_model=none,mount_tag=autopkgtest -drive index=0,file=/tmp/autopkgtest-qemu.w1mlh54b/overlay.img,cache=unsafe,if=virtio,discard=unmap,format=qcow2 -enable-kvm -cpu kvm64,+vmx,+lahf_lm
... which is a typical qemu commandline, I'm sorry to say. That gives us a VM with those settings (paths are relative to a temporary directory, /tmp/autopkgtest-qemu.w1mlh54b/ in the above example):
  • the shared/ directory is, well, shared with the VM
  • port 10022 is forwarded to the VM's port 22, presumably for SSH, but no SSH server is started by default
  • the ttyS0 and ttyS1 UNIX sockets are mapped to the first two serial ports (use nc -U to talk with those)
  • the monitor UNIX socket is a qemu control socket (see the QEMU monitor documentation, also nc -U)
In other words, it's possible to access the VM with:
nc -U /tmp/autopkgtest-qemu.w1mlh54b/ttyS1
The nc socket interface is ... not great, but it works well enough. And you can probably fire up an SSHd to get a better shell if you feel like it.
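The monitor socket mentioned above works the same way and gives you a qemu control prompt; a quick sketch:
# attach to the qemu monitor of the running test VM
nc -U /tmp/autopkgtest-qemu.w1mlh54b/monitor
# at the (qemu) prompt, commands like these are available:
#   info status          # is the VM running or paused?
#   stop / cont          # pause and resume the guest
#   system_powerdown     # ask the guest to shut down cleanly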

Nitty-gritty details no one cares about

Fixing hang in sbuild cleanup I'm having a hard time making heads or tails of this, but please bear with me. In sbuild + schroot, there's this notion that we don't really need to clean up after ourselves inside the schroot, as the schroot will just be deleted anyway. This behavior seems to be handled by the internal "Session Purged" parameter. At least in lib/Sbuild/Build.pm, we can see this:
my $is_cloned_session = (defined ($session->get('Session Purged')) &&
             $session->get('Session Purged') == 1) ? 1 : 0;
[...]
if ($is_cloned_session) {
    $self->log("Not cleaning session: cloned chroot in use\n");
} else {
    if ($purge_build_deps) {
        # Removing dependencies
        $resolver->uninstall_deps();
    } else {
        $self->log("Not removing build depends: as requested\n");
    }
}
The schroot builder defines that parameter as:
    $self->set('Session Purged', $info->{'Session Purged'});
... which is ... a little confusing to me. $info is:
my $info = $self->get('Chroots')->get_info($schroot_session);
... so I presume that depends on whether the schroot was correctly cleaned up? I stopped digging there... ChrootUnshare.pm is way more explicit:
$self->set('Session Purged', 1);
I wonder if we should do something like this with the autopkgtest backend. I guess people might technically use it with something other than qemu, but qemu is the typical use case of the autopkgtest backend, in my experience. Or at least certainly with things that clean up after themselves. Right? For some reason, before I added this line to my configuration:
$purge_build_deps = 'never';
... the "Cleanup" step would just completely hang. It was quite bizarre.

Digression on the diversity of VM-like things There are a lot of different virtualization solutions one can use (e.g. Xen, KVM, Docker or Virtualbox). I have also found libguestfs to be useful to operate on virtual images in various ways. Libvirt and Vagrant are also useful wrappers on top of the above systems. In particular, there are a lot of different tools which use Docker, virtual machines, or some sort of isolation stronger than a chroot to build packages. Here are some of the alternatives I am aware of. Take, for example, Whalebuilder, which uses Docker to build packages instead of pbuilder or sbuild. Docker provides more isolation than a simple chroot: in whalebuilder, packages are built without network access and inside a virtualized environment. Keep in mind there are limitations to Docker's security and that pbuilder and sbuild do build under a different user which will limit the security issues with building untrusted packages. On the upside, some of those things are being fixed: whalebuilder is now an official Debian package (whalebuilder) and has added the feature of passing custom arguments to dpkg-buildpackage. None of those solutions (except the autopkgtest/qemu backend) are implemented as an sbuild plugin, which would greatly reduce their complexity. I was previously using Qemu directly to run virtual machines, and had to create VMs by hand with various tools. This didn't work so well, so I switched to using Vagrant as a de-facto standard to build development environment machines, but I'm returning to Qemu because it uses a similar backend as KVM and can be used to host longer-running virtual machines through libvirt. The great thing now is that autopkgtest has good support for qemu and sbuild has bridged the gap and can use it as a build backend. I had originally found those bugs in that setup, but all of them are now fixed:
  • #911977: sbuild: how do we correctly guess the VM name in autopkgtest?
  • #911979: sbuild: fails on chown in autopkgtest-qemu backend
  • #911963: autopkgtest qemu build fails with proxy_cmd: parameter not set
  • #911981: autopkgtest: qemu server warns about missing CPU features
So we have unification! It's possible to run your virtual machines and Debian builds using a single VM image as the backend storage, which is no small feat, in my humble opinion. See the sbuild-qemu blog post for the announcement. Now I just need to figure out how to merge Vagrant, GNOME Boxes, and libvirt together, which should be a matter of placing images in the right place... right? See also hosting.
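For reference, the unified setup boils down to a couple of commands; the sbuild option names below (--chroot-mode, --autopkgtest-virt-server, --autopkgtest-virt-server-opts) are my reading of the manual pages rather than a tested recipe, and the image path and .dsc file are placeholders:
# Create a Debian unstable qemu image with autopkgtest's own tooling
# (needs root; the target path is a placeholder).
sudo autopkgtest-build-qemu unstable /srv/vm/autopkgtest-unstable.img
# Build a package inside that image via sbuild's autopkgtest backend;
# hello_2.10-2.dsc is a placeholder source package.
sbuild --chroot-mode=autopkgtest \
    --autopkgtest-virt-server=qemu \
    --autopkgtest-virt-server-opts='/srv/vm/autopkgtest-unstable.img' \
    hello_2.10-2.dsc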

pbuilder vs sbuild I was previously using pbuilder and switched in 2017 to sbuild. AskUbuntu.com has a good comparison of pbuilder and sbuild that shows they are pretty similar. The big advantage of sbuild is that it is the tool in use on the buildds, and it's written in Perl instead of shell. My concerns about switching were POLA (I'm used to pbuilder), the fact that pbuilder runs as a separate user (this works with sbuild as well now, if the _apt user is present), and setting up COW semantics in sbuild (you can't just plug cowbuilder in there; you need to configure overlayfs or aufs, which was non-trivial in Debian jessie). Ubuntu folks, again, have more documentation there. Debian also has extensive documentation, especially about how to configure overlays. I was ultimately convinced by stapelberg's post on the topic, which shows how much simpler sbuild really is.
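For those sticking with the schroot backend, the COW part is nowadays a single union-type line in the chroot definition; a rough sketch (run as root), with the chroot name and directory as placeholders:
# Hypothetical schroot definition for an sbuild chroot backed by an
# overlayfs union; adjust the section name and directory to your setup.
cat > /etc/schroot/chroot.d/unstable-amd64-sbuild <<'EOF'
[unstable-amd64-sbuild]
type=directory
directory=/srv/chroot/unstable-amd64-sbuild
groups=sbuild
root-groups=sbuild
profile=sbuild
union-type=overlay
EOF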

Who Thanks lavamind for the introduction to the sbuild-qemu package.

5 April 2022

Kees Cook: security things in Linux v5.10

Previously: v5.9 Linux v5.10 was released in December, 2020. Here's my summary of various security things that I found interesting:
AMD SEV-ES While guest VM memory encryption with AMD SEV has been supported for a while, Joerg Roedel, Thomas Lendacky, and others added register state encryption (SEV-ES). This means it's even harder for a VM host to reconstruct a guest VM's state.
x86 static calls Josh Poimboeuf and Peter Zijlstra implemented static calls for x86, which operates very similarly to the static branch infrastructure in the kernel. With static branches, an if/else choice can be hard-coded, instead of being run-time evaluated every time. Such branches can be updated too (the kernel just rewrites the code to switch around the "branch"). All these principles apply to static calls as well, but they're for replacing indirect function calls (i.e. a call through a function pointer) with a direct call (i.e. a hard-coded call address). This eliminates the need for Spectre mitigations (e.g. RETPOLINE) for these indirect calls, and avoids a memory lookup for the pointer. For hot-path code (like the scheduler), this has a measurable performance impact. It also serves as a kind of Control Flow Integrity implementation: an indirect call got removed, and the potential destinations have been explicitly identified at compile-time.
network RNG improvements In an effort to improve the pseudo-random number generator used by the network subsystem (for things like port numbers and packet sequence numbers), Linux's home-grown pRNG has been replaced by the SipHash round function, and perturbed by (hopefully) hard-to-predict internal kernel states. This should make it very hard to brute force the internal state of the pRNG and make predictions about future random numbers just from examining network traffic. Similarly, ICMP's global rate limiter was adjusted to avoid leaking details of network state, as a start to fixing recent DNS Cache Poisoning attacks.
SafeSetID handles GID Thomas Cedeno improved the SafeSetID LSM to handle group IDs (which required teaching the kernel about which syscalls were actually performing setgid.) Like the earlier setuid policy, this lets the system owner define an explicit list of allowed group ID transitions under CAP_SETGID (instead of to just any group), providing a way to keep the power of granting this capability much more limited. (This isn't complete yet, though, since handling setgroups() is still needed.)
improve kernel's internal checking of file contents The kernel provides LSMs (like the Integrity subsystem) with details about files as they're loaded. (For example, loading modules, new kernel images for kexec, and firmware.) There wasn't very good coverage for cases where the contents were coming from things that weren't files. To deal with this, new hooks were added that allow the LSMs to introspect the contents directly, and to do partial reads. This will give the LSMs much finer grain visibility into these kinds of operations.
set_fs removal continues With the earlier work landed to free the core kernel code from set_fs(), Christoph Hellwig made it possible for set_fs() to be optional for an architecture. Subsequently, he then removed set_fs() entirely for x86, riscv, and powerpc. These architectures will now be free from the entire class of kernel address limit attacks that only needed to corrupt a single value in struct thread_info.
sysfs_emit() replaces sprintf() in /sys Joe Perches tackled one of the most common bug classes with sprintf() and snprintf() in /sys handlers by creating a new helper, sysfs_emit(). This will handle the cases where kernel code was not correctly dealing with the length results from sprintf() calls, which might lead to buffer overflows in the PAGE_SIZE buffer that /sys handlers operate on. With the helper in place, it was possible to start the refactoring of the many sprintf() callers.
nosymfollow mount option Mattias Nissler and Ross Zwisler implemented the nosymfollow mount option. This entirely disables symlink resolution for the given filesystem, similar to other mount options where noexec disallows execve(), nosuid disallows setid bits, and nodev disallows device files. Quoting the patch, it is useful as a defensive measure for systems that need to deal with untrusted file systems in privileged contexts. (i.e. for when /proc/sys/fs/protected_symlinks isn't a big enough hammer.) Chrome OS uses this option for its stateful filesystem, as symlink traversal has been a common attack-persistence vector.
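As a quick illustration (not taken from the patch), the option behaves like its siblings on the mount command line, provided util-linux is new enough to know the flag; the device and mountpoint are placeholders:
# Mount an untrusted filesystem with symlink resolution disabled;
# /dev/sdb1 and /mnt/untrusted are placeholders.
mount -o nosymfollow,nosuid,nodev,noexec /dev/sdb1 /mnt/untrusted
# Symlinks can still be created and listed, but following one fails:
cat /mnt/untrusted/some-symlink   # fails with ELOOP ("Too many levels of symbolic links")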
ARMv8.5 Memory Tagging Extension support Vincenzo Frascino added support to arm64 for the coming Memory Tagging Extension, which will be available for ARMv8.5 and later chips. It provides 4 bits of tags (covering multiples of 16 byte spans of the address space). This is enough to deterministically eliminate all linear heap buffer overflow flaws (1 tag for "free", and then rotate even values and odd values for neighboring allocations), which is probably one of the most common bugs being currently exploited. It also makes use-after-free and over/under indexing much more difficult for attackers (but still possible if the target's tag bits can be exposed). Maybe some day we can switch to 128 bit virtual memory addresses and have fully versioned allocations. But for now, 16 tag values is better than none, though we do still need to wait for anyone to actually be shipping ARMv8.5 hardware.
fixes for flaws found by UBSAN The work to make UBSAN generally usable under syzkaller continues to bear fruit, with various fixes all over the kernel for stuff like shift-out-of-bounds, divide-by-zero, and integer overflow. Seeing these kinds of patches land reinforces the rationale of shifting the burden of these kinds of checks to the toolchain: these run-time bugs continue to pop up.
flexible array conversions The work on flexible array conversions continues. Gustavo A. R. Silva and others continued to grind on the conversions, getting the kernel ever closer to being able to enable the -Warray-bounds compiler flag and clear the path for saner bounds checking of array indexes and memcpy() usage. That's it for now! Please let me know if you think anything else needs some attention. Next up is Linux v5.11.

© 2022, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License (CC BY-SA 4.0).

3 March 2022

Enrico Zini: Migrating from procmail to sieve

Anarcat's "procmail considered harmful" post convinced me to get my act together and finally migrate my venerable procmail based setup to sieve. My setup was nontrivial, so I migrated with an intermediate step in which sieve scripts would by default pipe everything to procmail, which allowed me to slowly move rules from procmailrc to sieve until nothing remained in procmailrc. Here's what I did. Literature review
  • https://brokkr.net/2019/10/31/lets-do-dovecot-slowly-and-properly-part-3-lmtp/ has a guide quite aligned with current Debian, and could be a starting point to get an idea of the work to do.
  • https://wiki.dovecot.org/HowTo/PostfixDovecotLMTP is way more terse, but more aligned with my intentions. Reading the former helped me in understanding the latter.
  • https://datatracker.ietf.org/doc/html/rfc5228 has the full Sieve syntax.
  • https://doc.dovecot.org/configuration_manual/sieve/pigeonhole_sieve_interpreter/ has the list of Sieve features supported by Dovecot.
  • https://doc.dovecot.org/settings/pigeonhole/ has the reference on Dovecot's sieve implementation.
  • https://raw.githubusercontent.com/dovecot/pigeonhole/master/doc/rfc/spec-bosch-sieve-extprograms.txt is the hard to find full reference for the functions introduced by the extprograms plugin.
Debugging tools:
  • doveconf to dump dovecot's configuration to see if what it understands matches what I mean
  • sieve-test parses sieve scripts: sieve-test file.sieve /dev/null is a quick and dirty syntax check
Backup of all mails processed One thing I did with procmail was to generate a monthly mailbox with all incoming email, with something like this:
BACKUP="/srv/backupts/test-`date +%Y-%m-d`.mbox"
:0c
$BACKUP
I did not find an obvious way in sieve to create monthly mailboxes, so I redesigned that system using Postfix's always_bcc feature, piping everything to an archive user (a sketch of that setting is below). I'll then recreate the monthly archiving using a chewmail script that I can simply run via cron.
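On the Postfix side this is a single setting; roughly the following, with the archive address being a placeholder for whatever local user collects the copies:
# Deliver a copy of every message passing through Postfix to a local
# archive user; "archive@localhost" is a placeholder address.
postconf -e 'always_bcc = archive@localhost'
postfix reload
Configure dovecot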
apt install dovecot-sieve dovecot-lmtpd
I added this to the local dovecot configuration:
service lmtp {
  unix_listener /var/spool/postfix/private/dovecot-lmtp {
    user = postfix
    group = postfix
    mode = 0666
  }
}
protocol lmtp {
  mail_plugins = $mail_plugins sieve
}
plugin {
  sieve = file:~/.sieve;active=~/.dovecot.sieve
}
This makes Dovecot ready to receive mail from Postfix via an LMTP unix socket created in Postfix's private chroot. It also activates the sieve plugin, and uses ~/.sieve as a sieve script. The script can be a file or a directory; if it is a directory, ~/.dovecot.sieve will be a symlink pointing to the .sieve file to run. This is a feature I'm not yet using, but if one day I want to try enabling UIs to edit sieve scripts, that part is ready. Delegate to procmail To make sieve scripts that delegate to procmail, I enabled the sieve_extprograms plugin:
 plugin {
   sieve = file:~/.sieve;active=~/.dovecot.sieve
+  sieve_plugins = sieve_extprograms
+  sieve_extensions = +vnd.dovecot.pipe
+  sieve_pipe_bin_dir = /usr/local/lib/dovecot/sieve-pipe
+  sieve_trace_dir = ~/.sieve-trace
+  sieve_trace_level = matching
+  sieve_trace_debug = yes
 }
and then created a script for it:
mkdir -p /usr/local/lib/dovecot/sieve-pipe/
(echo "#!/bin/sh"; echo "exec /usr/bin/procmail") > /usr/local/lib/dovecot/sieve-pipe/procmail
chmod 0755 /usr/local/lib/dovecot/sieve-pipe/procmail
And I can have a sieve script that delegates processing to procmail:
require "vnd.dovecot.pipe";
pipe "procmail";
Activate the postfix side These changes switched local delivery over to Dovecot:
--- a/roles/mailserver/templates/dovecot.conf
+++ b/roles/mailserver/templates/dovecot.conf
@@ -25,6 +25,8 @@
 
+auth_username_format = %Ln
+
 
diff --git a/roles/mailserver/templates/main.cf b/roles/mailserver/templates/main.cf
index d2c515a..d35537c 100644
--- a/roles/mailserver/templates/main.cf
+++ b/roles/mailserver/templates/main.cf
@@ -64,8 +64,7 @@ virtual_alias_domains =
 
-mailbox_command = procmail -a "$EXTENSION"
-mailbox_size_limit = 0
+mailbox_transport = lmtp:unix:private/dovecot-lmtp
 
Without auth_username_format = %Ln dovecot won't be able to understand usernames sent by postfix in my specific setup. Moving rules over to sieve This is mostly straightforward, with the luxury of being able to do it a bit at a time. The last tricky bit was how to call spamc from sieve, as in some situations I reduce system load by running the spamfilter only on a prefiltered selection of incoming emails. For this I enabled the filter directive in sieve:
 plugin {
   sieve = file:~/.sieve;active=~/.dovecot.sieve
   sieve_plugins = sieve_extprograms
-  sieve_extensions = +vnd.dovecot.pipe
+  sieve_extensions = +vnd.dovecot.pipe +vnd.dovecot.filter
   sieve_pipe_bin_dir = /usr/local/lib/dovecot/sieve-pipe
+  sieve_filter_bin_dir = /usr/local/lib/dovecot/sieve-filter
   sieve_trace_dir = ~/.sieve-trace
   sieve_trace_level = matching
   sieve_trace_debug = yes
 }
Then I created a filter script:
mkdir -p /usr/local/lib/dovecot/sieve-filter/
(echo "#!/bin/sh"; echo "exec /usr/bin/spamc") > /usr/local/lib/dovecot/sieve-filter/spamc
chmod 0755 /usr/local/lib/dovecot/sieve-filter/spamc
And now what was previously:
:0 fw
| /usr/bin/spamc
:0
* ^X-Spam-Status: Yes
.spam/
Can become:
require "vnd.dovecot.filter";
require "fileinto";
filter "spamc";
if header :contains "x-spam-level" "**************" {
    discard;
} elsif header :matches "X-Spam-Status" "Yes,*" {
    fileinto "spam";
}
Updates Ansgar mentioned that it's possible to replicate the monthly mailbox using the variables and date extensions, with a hacky trick from the extensions' RFC:
require "date";
require "variables";
if currentdate :matches "month" "*" { set "month" "${1}"; }
if currentdate :matches "year" "*" { set "year" "${1}"; }
fileinto :create "${month}-${year}";

23 February 2022

Jonathan McDowell: Upgrading my home internet; a story of yak shaving

RB5009 This has ended up longer than I expected. I'll write up posts about some of the individual steps with some more details at some point, but this is an overview of the yak shaving I engaged in. The TL;DR is:
  • I wanted to upgrade my internet connection, but:
  • My router wasn't fast enough, so:
  • I bought a new one and:
  • Proceeded to help work on mainline Linux support, and:
  • Did some tweaking of my Debian setup to allow for a squashfs root, and:
  • Upgraded it to Debian 11 (bullseye) in the process, except:
  • It turned out my home automation devices weren't happy, so:
  • I dug into some memory issues on my ESP8266 firmware, which:
  • Led to diagnosing some TLS interaction issues with the firmware, and:
  • I had an interlude into some interrupt affinity issues, but:
  • I finally got there.

The desire for a faster connection When I migrated my home connection to FTTP I kept the same 80M/20M profile I'd had on FTTC. I didn't have a pressing need for faster, and I saved money because I was no longer paying for the phone line portion. I wanted more, but at the time I think the only option was for a 160M/30M profile instead and I didn't need it and it wasn't enough better to convince me. Time passed and BT rolled out their GigE (really 900M) download option. And again, I didn't need it, but I wanted it. My provider, Aquiss, initially didn't offer this (I think they had up to 330M download options available by this point). So I stayed on 80M/20M. And the only time I really wanted it to be faster was when pushing off-site backups to rsync.net. Of course, we've had the pandemic, and that's involved 2 adults working from home with plenty of video calls throughout the day. The 80M/20M connection has proved rock solid for this, so again, I didn't feel an upgrade was justified. We got a 4K capable TV last year and while the bandwidth usage for 4K streaming is noticeably higher, again the connection can handle it no problem. At some point last year I noticed Aquiss had added speed options all the way to 900M down. At the end of the year I accepted a new role, which is fully remote, so I had a bit of an acceptance about the fact that I wasn't going back into an office any time soon. The combination (and the desire for the increased upload speed) finally allowed me to justify the upgrade to myself.

Testing the current setup for bottlenecks The first thing to do was see whether my internal network could cope with an upgrade. I'm mostly running Cat6 GigE so I wasn't worried about that side of things. However I'm using an RB3011 as my core router, and while it has some coprocessors for routing acceleration, they're not supported under mainline Linux (and unlikely to be any time soon). So I had to benchmark what it was capable of routing. I run a handful of VLANs within my home network, with stateful firewalling between them, so I felt that would be a good approximation of the maximum speed to the outside world I might be able to get if I had the external connection upgraded. I went for the easy approach and fired up iPerf3 on 2 hosts, both connected via ethernet but on separate networks, so routed through the RB3011. That resulted in slightly more than a 300Mb/s throughput. Ok. I confirmed that I could get 900Mb/s+ on 2 hosts both on the same network, just to be sure there wasn't some other issue I was missing. Nope, so unsurprisingly the router was the bottleneck. So. To upgrade my internet speed I need to upgrade my router. I could just buy something off the shelf, but I like being able to run Debian (or OpenWRT) on the router rather than some horrible vendor firmware. Luckily MikroTik launched the RB5009 towards the end of last year. RouterOS is probably more than capable, but what really interested me was the fact it's an ARM64 platform based on an Armada 7040, which is pretty well supported in mainline kernels already. There's a 10G connection from the internal switch to the CPU, as well as a 2.5Gb/s ethernet port and a 10G SFP+ cage. All good stuff. I ordered one just before the New Year. Thankfully the OpenWRT folk had done all of the hard work on getting a mainline kernel booting on the device; Sergey Sergeev and Robert Marko in particular fighting RouterBoot and producing a suitable device tree file to get everything up and running. I ended up soldering a serial console connection up to aid debugging, and lightly patching Rob's u-boot to fix the incorrect RAM size reported by RouterBoot. A few kernel tweaks were necessary to make the networking entirely happy and at that point it was time to think about actually doing a replacement.
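The test itself was nothing more sophisticated than a stock iperf3 client/server pair, roughly like this (the host name is a placeholder):
# On a host in one VLAN, run the iperf3 server:
iperf3 -s
# On a host in another VLAN, so the traffic is routed through the RB3011:
iperf3 -c testhost.internal -t 30
# Add -R to measure the reverse direction as well.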

Upgrading to Debian 11 (bullseye) My RB3011 is currently running Debian 10 (buster); an upgrade has been on my todo list, but with the impending replacement I decided I'd hold off and create a new Debian 11 (bullseye) image for the RB5009. Additionally, I don't actually run off the internal NAND in the RB3011; I have a USB flash drive for the rootfs and just the kernel booting off internal NAND. Originally this was for ease of testing, then a combination of needing to figure out a good read-only root solution and a small enough image to fit in the 120M available. For the upgrade I decided to finally look at these pieces. I've ended up with a script that will build me a squashfs image, and the initial rootfs takes care of mounting this and then a tmpfs as an overlay fs. That means I can easily see what pieces are being written to. The RB5009 has a total of 1G NAND so I'm not as space constrained, but the squashfs ends up under 50M. I've added some additional pieces to allow me to pre-populate the overlay fs with updates rather than always needing to rebuild the squashfs image. With that done I decided to try it out on the RB3011; I tweaked the build script to be able to build for armhf (the RB3011) or arm64 (the RB5009) and to deal with some slight differences in configuration between the two (e.g. interface naming). The idea here was to ensure I'd got all the appropriate configuration sorted for the RB5009, in the known-good existing environment. Everything is still on a USB stick at this stage and the new device has an armhf busybox root meaning it can be used on either device, and the init script detects the architecture to select the appropriate squashfs to mount.
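The general shape of a squashfs root with a tmpfs overlay is roughly the following sketch; the paths, device and compression choice are illustrative rather than the actual build script:
# Pack a prepared root tree into a squashfs image.
mksquashfs rootfs/ root.squashfs -comp xz -noappend
# At boot, the initramfs then does roughly the equivalent of:
mkdir -p /ro /rw /newroot
mount -t squashfs -o ro /dev/mmcblk0p2 /ro        # placeholder device
mount -t tmpfs tmpfs /rw
mkdir -p /rw/upper /rw/work
mount -t overlay overlay -o lowerdir=/ro,upperdir=/rw/upper,workdir=/rw/work /newroot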

A problem with ESP8266 home automation devices Everything seemed to work fine - a few niggles with the watchdog, which is overly sensitive on the RB3011, but I got those sorted (and the build script updated) and the device came up and successfully did the PPPoE dance to bring up external connectivity. And then I noticed that my home automation devices were having problems connecting to the mosquitto MQTT server. It turned out it was only the ESP8266 based devices that were failing, and examining the serial debug output on one of my test devices revealed it was hitting an out of memory issue (displaying E:M 280) when establishing the TLS MQTT connection. I rolled back to the Debian 10 image and set about creating a test environment to look at the ESP8266 issues. My first action was to try and reduce my RAM footprint to try and ensure there was enough spare to establish the connection. I moved a few functions that were still sitting in IRAM into flash. I cleaned up a couple of buffers that are on the stack to be more correctly sized. I tried my new image, and I didn't get the memory issue. Instead I progressed a bit further and got a watchdog reset. Doh! It was obviously something related to the TLS connection, but I couldn't easily see what the difference was; the same x509 cert was in use, it looked like the initial handshake was the same (and trying with openssl s_client looked pretty similar too). I set about instrumenting the ancient Mbed TLS used in the Espressif SDK and discovered that whatever had changed between buster + bullseye meant the ESP8266 was now trying a TLS-DHE-RSA-WITH-AES-256-CBC-SHA256 handshake instead of a TLS-RSA-WITH-AES-256-CBC-SHA256 handshake and that was causing enough extra CPU usage that it couldn't complete in time and the watchdog kicked in. So I commented out MBEDTLS_KEY_EXCHANGE_DHE_RSA_ENABLED in the config_esp.h for mbedtls and rebuilt things. Hacky, but I'll go back to trying to improve this generally at some point.
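One way to see which handshake the server side will negotiate, without instrumenting the firmware, is openssl s_client; a sketch, with the broker address as a placeholder:
# Show which cipher suite a TLS 1.2 handshake settles on; the broker
# address and port are placeholders for your MQTT server.
openssl s_client -connect mqtt.example.org:8883 -tls1_2 </dev/null 2>/dev/null \
  | grep -E 'Protocol|Cipher'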

A detour into interrupt load Now, my testing of the RB3011 image is generally done at weekends, when I have enough time to tear down and rebuild the connection rather than doing it in the evening and having limited time to get things working again in time for work in the morning. So at the point I had an image ready to go I pulled the trigger on the line upgrade. I went with the 500M/75M option rather than the full 900M - I suspect I'd have difficulty actually getting that most of the time and 75M of upload bandwidth seems fairly substantial for now. It only took a couple of days from the order to the point the line was regraded (which involved no real downtime - just a reconnection in the night). Of course this happened just after the weekend I'd discovered the ESP8266 issue. collectd CPU usage for RB3011 This provided an opportunity to see just what the RB3011 could actually manage. In the configuration I had it turned out to be not much more than the 80Mb/s speeds I had previously seen. The upload jumped from a solid 20Mb/s to 75Mb/s, so I knew the regrade had actually happened. Looking at CPU utilisation clearly showed the problem; softirqs were using almost 100% of a CPU core. Now, the way the hardware is setup on the RB3011 is that there are two separate 5 port switches, each connected back to the CPU via a separate GigE interface. For various reasons I had everything on a single switch, which meant that all traffic was boomeranging in and out of the same CPU interface. The IPQ8064 has dual cores, so I thought I'd try moving the external connection to the other switch. That puts it on its own GigE CPU interface, which then allows binding the interrupts to a different CPU core. That helps; throughput to the outside world hits 140Mb/s+. Still a long way from the expected max, but proof we just need more grunt.
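Moving an interface's interrupts to another core is the usual /proc/irq dance; a sketch, with the interface name and IRQ number as placeholders:
# Find which IRQ(s) the external-facing interface uses.
grep eth1 /proc/interrupts
# Pin IRQ 42 (a placeholder for the number found above) to the second
# core; the value is a CPU bitmask, so 2 means CPU1.
echo 2 > /proc/irq/42/smp_affinity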

Success collectd CPU usage for RB5009 Which brings us to this past weekend, when, having worked out all the other bits, I tried the squashfs root image again on the RB3011. Success! The home automation bits connected to it, the link to the outside world came up, everything seemed happy. So I double checked my bootloader bits on the RB5009, brought it down to the comms room and plugged it in instead. And, modulo my failing to update the nftables config to allow it to do forwarding, it all came up ok. Some testing with iperf3 internally got a nice 912Mb/s sustained between subnets, and some less scientific testing with wget + speedtest-cli saw speeds of over 460Mb/s to the outside world. Time from ordering the router until it was in service? Just under 8 weeks.

27 January 2022

Russ Allbery: Review: I Didn't Do the Thing Today

Review: I Didn't Do the Thing Today, by Madeleine Dore
Publisher: Avery
Copyright: 2022
ISBN: 0-593-41914-6
Format: Kindle
Pages: 291
At least from my narrow view of it, the world of productivity self-help literature is a fascinating place right now. The pandemic overturned normal work patterns and exacerbated schedule inequality, creating vastly different experiences for the people whose work continued to be in-person and the people whose work could become mostly or entirely remote. Self-help literature, which is primarily aimed at the more affluent white-collar class, primarily tracked the latter disruption: newly-remote work, endless Zoom meetings, the impossibility of child care, the breakdown of boundaries between work and home, and the dawning realization that much of the mechanics of day-to-day office work are neither productive nor defensible. My primary exposure these days to the more traditional self-help productivity literature is via Cal Newport. The stereotype of the productivity self-help book is a collection of life hacks and list-making techniques that will help you become a more efficient capitalist cog, but Newport has been moving away from that dead end for as long as I've been reading him, and his recent work focuses more on structural issues with the organization of knowledge work. He also shares with the newer productivity writers a willingness to tell people to use the free time they recover via improved efficiency on some life goal other than improved job productivity. But he's still prickly and defensive about the importance of personal productivity and accomplishing things. He gives lip service on his podcast to the value of the critique of productivity, but then usually reverts to characterizing anti-productivity arguments as saying that productivity is a capitalist invention to control workers. (Someone has doubtless said this on Twitter, but I've never seen a serious critique of productivity make this simplistic of an argument.) On the anti-productivity side, as it's commonly called, I've seen a lot of new writing in the past couple of years that tries to break the connection between productivity and human worth so endemic to US society. This is not a new analysis; disabled writers have been making this point for decades, it's present in both Keynes and in Galbraith's The Affluent Society, and Kathi Weeks's The Problem with Work traces some of its history in Marxist thought. But what does feel new to me is its widespread mainstream appearance in newspaper articles, viral blog posts, and books such as Jenny Odell's How to Do Nothing and Devon Price's Laziness Does Not Exist. The pushback against defining life around productivity is having a moment. Entering this discussion is Madeleine Dore's I Didn't Do the Thing Today. Dore is the author of the Extraordinary Routines blog and host of the Routines and Ruts podcast. Extraordinary Routines began as a survey of how various people organize their daily lives. I Didn't Do the Thing Today is, according to the preface, a summary of the thoughts Dore has had about her own life and routines as a result of those interviews. As you might guess from the subtitle (Letting Go of Productivity Guilt), Dore's book is superficially on the anti-productivity side. Its chapters are organized around gentle critiques of productivity concepts, with titles like "The Hopeless Search for the Ideal Routine," "The Myth of Balance," or "The Harsh Rules of Discipline." But I think anti-productivity is a poor name for this critique; its writers are not opposed to being productive, only to its position as an all-consuming focus and guilt-generating measure of personal worth. 
Dore structures most chapters by naming an aspect, goal, or concern of a life defined by productivity, such as wasted time, ambition, busyness, distraction, comparison, or indecision. Each chapter sketches the impact of that idea and then attempts to gently dismantle the grip that it may have on the reader's life. All of these discussions are nuanced; it's rare for Dore to say that one of these aspects has no value, and she anticipates numerous objections. But her overarching goal is to help the reader be more comfortable with imperfection, more willing to live in the moment, and less frustrated with the limitations of life and the human brain. If striving for productivity is like lifting weights, Dore's diagnosis is that we've tried too hard for too long, and have overworked that muscle until it is cramping. This book is a gentle massage to induce the muscle to relax and let go. Whether this will work is, as with all self-help books, individual. I found it was best read in small quantities, perhaps a chapter per day, since it otherwise began feeling too much the same. I'm also not the ideal audience; Dore is a creative freelancer and primarily interviewed other creative people, which I think has a different sort of productivity rhythm than the work that I do. She's also not a planner to the degree that I am; more on that below. And yet, I found this book worked on me anyway. I can't say that I was captivated all the way through, but I found myself mentally relaxing while I was reading it, and I may re-read some chapters from time to time. How does this relate to the genre of productivity self-help? With less conflict than I think productivity writers believe, although there seems to be one foundational difference of perspective. Dore is not opposed to accomplishing things, or even to systems that help people accomplish things. She is more attuned than the typical productivity writer to the guilt and frustration that can accumulate when one has a day in which one does not do the thing, but her goal is not to talk you out of attempting things. It is, instead, to convince you to hold those attempts and goals more lightly, to allow them to move and shift and change, and to not treat a failure to do the thing today as a reason for guilt. This is wholly compatible with standard productivity advice. It's adding nuance at one level of abstraction higher: how tightly to cling to productivity goals, and what to do when they don't work out. Cramping muscles are not strong muscles capable of lifting heavy things. If one can massage out the cramp, one's productivity by even the strict economic definition may improve. Where I do see a conflict is that most productivity writers are planners, and Dore is not. This is, I think, a significant blind spot in productivity self-help writing. Cal Newport, for example, advocates time-block planning, where every hour of the working day has a job. David Allen advocates a complex set of comprehensive lists and well-defined next actions. Mark Forster builds a flurry of small systems for working through lists. The standard in productivity writing is to add structure to your day and cultivate the self-discipline required to stick to that structure. For many people, including me, this largely works. I'm mostly a planner, and when my life gets chaotic, adding more structure and focusing on that structure helps me. But the productivity writers I've read are quite insistent that their style of structure will work for everyone, and on that point I am dubious.
Newport, for example, advocates time-block planning for everyone without exception, insisting that it is the best way to structure a day. Dore, in contrast, describes spending years trying to perfect a routine before realizing that elastic possibilities work better for her than routines. For those who are more like Dore than Newport, I Didn't Do the Thing Today is more likely to be helpful than Newport's instructions. This doesn't make Newport's ideas wrong; it simply makes them not universal, something that the productivity self-help genre seems to have trouble acknowledging. Even for readers like myself who prefer structure, I Didn't Do the Thing Today is a valuable corrective to the emphasis on ever-better systems. For those who never got along with too much structure, I think it may strike a chord. The standard self-help caveat still applies: Dore has the most to say to people who are in a similar social class and line of work as her. I'm not sure this book will be of much help to someone who has to juggle two jobs with shift work and child care, where the problem is more sharp external constraints than internalized productivity guilt. But for its target audience, I think it's a valuable, calming message. Dore doesn't have a recipe to sort out your life, but may help you feel better about the merits of life unsorted. Rating: 7 out of 10

16 January 2022

Russ Allbery: Review: The Brightest Fell

Review: The Brightest Fell, by Seanan McGuire
Series: October Daye #11
Publisher: DAW
Copyright: 2017
ISBN: 0-698-18352-5
Format: Kindle
Pages: 353
This is the eleventh book in the October Daye urban fantasy series, not counting various novellas and side stories. You really cannot start here, particularly given how many ties this book has to the rest of the series. I would like to claim there's some sort of plan or strategy in how I read long series, but there are just a lot of books to read and then I get distracted and three years have gone by. The advantage of those pauses, at least for writing reviews, is that I return to the series with fresh eyes and more points of comparison. My first thought this time around was "oh, these books aren't that well written, are they," followed shortly thereafter by staying up past midnight reading just one more chapter. Plot summaries are essentially impossible this deep into a series, when even the names of the involved characters can be a bit of a spoiler. What I can say is that we finally get the long-awaited confrontation between Toby and her mother, although it comes in an unexpected (and unsatisfying) form. This fills in a few of the gaps in Toby's childhood, although there's not much there we didn't already know. It fills in considerably more details about the rest of Toby's family, most notably her pure-blood sister. The writing is indeed not great. This series is showing some of the signs I've seen in other authors (Mercedes Lackey, for instance) who wrote too many books per year to do each of them justice. I have complained before about McGuire's tendency to reuse the same basic plot structure, and this instance seemed particularly egregious. The book opens with Toby enjoying herself and her found family, feeling like they can finally relax. Then something horrible happens to people she cares about, forcing her to go solve the problem. This in theory requires her to work out some sort of puzzle, but in practice is fairly linear and obvious because, although I love Toby as a character, she can't puzzle her way out of a wet sack. Everything is (mostly) fixed in the end, but there's a high cost to pay, and everyone ends the book with more trauma. The best books of this series are the ones where McGuire manages to break with this formula. This is not one of them. The plot is literally on magical rails, since The Brightest Fell skips even pretending that Toby is an actual detective (although it establishes that she's apparently still working as one in the human world, a detail that I find baffling) and gives her a plot compass that tells her where to go. I don't really mind this since I read this series for emotional catharsis rather than Toby's ingenuity, but alas that's mostly missing here as well. There is a resolution of sorts, but it's the partial and conditional kind that doesn't include awful people getting their just deserts. This is also not a good series entry for world-building. McGuire has apparently been dropping hints for this plot back at least as far as Ashes of Honor. I like that sort of long-term texture to series like this, but the unfortunate impact on this book is a lot of revisiting of previous settings and very little in the way of new world-building. The bit with the pixies was very good; I wanted more of that, not the trip to an Ashes of Honor setting to pick up a loose end, or yet another significant scene in Borderland Books. As an aside, I wish authors would not put real people into their books as characters, even when it's with permission as I'm sure it was here. 
It's understandable to write a prominent local business into a story as part of the local color (although even then I would rather it not be a significant setting in the story), but having the actual owner and staff show up, even in brief cameos, feels creepy and weird to me. It also comes with some serious risks because real people are not characters under the author's control. (All the content warnings for that link, which is a news story from three years after this book was published.) So, with all those complaints, why did I stay up late reading just one more chapter? Part of the answer is that McGuire writes very grabby books, at least for me. Toby is a full-speed-ahead character who is constantly making things happen, and although the writing in this book had more than the usual amount of throat-clearing and rehashing of the same internal monologue, the plot still moved along at a reasonable clip. Another part of the answer is that I am all-in on these characters: I like them, I want them to be happy, and I want to know what's going to happen next. It helps that McGuire has slowly added characters over the course of a long series and given most of them a chance to shine. It helps even more that I like all of them as people, and I like the style of banter that McGuire writes. Also, significant screen time for the Luidaeg is never a bad thing. I think this was the weakest entry in the series in a while. It wrapped up some loose ends that I wasn't that interested in wrapping up, introduced a new conflict that it doesn't resolve, spent a bunch of time with a highly unpleasant character I didn't enjoy reading about, didn't break much new world-building ground, and needed way more faerie court politics. But some of the banter was excellent, the pixies and the Luidaeg were great, and I still care a lot about these characters. I am definitely still reading. Followed by Nights and Silences. Continuing a pattern from Once Broken Faith, the ebook version of The Brightest Fell includes a bonus novella. (I'm not sure if it's also present in the print version.) "Of Things Unknown": As is usual for the short fiction in this series, this is a side story from the perspective of someone other than Toby. In this case, that's April O'Leary, first introduced all the way back in A Local Habitation, and the novella focuses on loose ends from that novel. Loose ends are apparently the theme of this book. This was... fine. I like April, I enjoyed reading a story from her perspective, and I'm always curious to see how Toby looks from the outside. I thought the plot was strained and the resolution a bit too easy and painless, and I was not entirely convinced by April's internal thought processes. It felt like McGuire left some potential for greater plot complications on the table here, and I found it hard to shake the impression that this story was patching an error that McGuire felt she'd made in the much earlier novel. But it was nice to have an unambiguously happy ending after the more conditional ending of the main story. (6) Rating: 6 out of 10

3 January 2022

Paul Wise: FLOSS Activities December 2021

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Review
  • Spam: reported 166 Debian mailing list posts
  • Patches: reviewed libpst upstream patches
  • Debian packages: sponsored nsis, memtest86+
  • Debian wiki: RecentChanges for the month
  • Debian BTS usertags: changes for the month
  • Debian screenshots:

Administration
  • libpst: setup GitHub presence, migrate from hg to git, requested details from bug reporters
  • plac: cleaned up git repo anomalies
  • Debian BTS: unarchive/reopen/triage bugs for reintroduced packages: stardict, node-carto
  • Debian wiki: unblock IP addresses, approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The purple-discord, python-plac, sptag, smart-open, libpst, memtest86+, oci-python-sdk work was sponsored. All other work was done on a volunteer basis.

28 December 2021

Russ Allbery: Review: Out of Office

Review: Out of Office, by Charlie Warzel & Anne Helen Petersen
Publisher: Alfred A. Knopf
Copyright: 2021
ISBN: 0-593-32010-7
Format: Kindle
Pages: 260
Out of Office opens with the provocative assertion that you were not working from home during the pandemic, even if you were among the 42% of Americans who were able to work remotely.
You were, quite literally, doing your job from home. But you weren't working from home. You were laboring in confinement and under duress. Others have described it as living at work. You were frantically tapping out an email while trying to make lunch and supervise distance learning. You were stuck alone in a cramped apartment for weeks, unable to see friends or family, exhausted, and managing a level of stress you didn't know was possible. Work became life, and life became work. You weren't thriving. You were surviving.
The stated goal of this book is to reclaim the concept of working from home, not only from the pandemic, but also from the boundary-destroying metastasis of work into non-work life. It does work towards that goal, but the description of what would be required for working from home to live up to its promise becomes a sweeping critique of the organization and conception of work, leaving it nearly as applicable to those who continue working from an office. Turns out that the main problem with working from home is the work part, not the "from home" part. This was a fascinating book to read in conjunction with A World Without Email. Warzel and Petersen do the structural and political analysis that I sometimes wish Newport would do more of, but as a result offer less concrete advice. Both, however, have similar diagnoses of the core problems of the sort of modern office work that could be done from home: it's poorly organized, poorly managed, and desperately inefficient. Rather than attempting to fix those problems, which is difficult, structural, and requires thought and institutional cooperation, we're compensating by working more. This both doesn't work and isn't sustainable. Newport has a background in productivity books and a love of systems and protocols, so his focus in A World Without Email is on building better systems of communication and organization of work. Warzel and Petersen come from a background of reporting and cultural critique, so they put more focus on power imbalances and power-serving myths about the American dream. Where Newport sees an easy-to-deploy ad hoc work style that isn't fit for purpose, Warzel and Petersen are more willing to point out intentional exploitation of workers in the guise of flexibility. But they arrive at some similar conclusions. The way office work is organized is not leading to more productivity. Tools like Slack encourage the public performance of apparent productivity at the cost of the attention and focus required to do meaningful work. And the process is making us miserable. Out of Office is, in part, a discussion of what would be required to do better work with less stress, but it also shares a goal with Newport and some (but not most) corners of productivity writing: spend less time and energy on work. The goal of Out of Office is not to get more work done. It's to work more efficiently and sustainably and thus work less. To reclaim the promise of flexibility so that it benefits the employee and not the employer. To recognize, in the authors' words, that the office can be a bully, locking people in to commute schedules and unnatural work patterns, although it also provides valuable moments of spontaneous human connection. Out of Office tries to envision a style of work that includes the office sometimes, home sometimes, time during the day to attend to personal chores or simply to take a mental break from an unnatural eight hours (or more) of continuous focus, universal design, real worker-centric flexibility, and an end to the constant productivity ratchet where faster work simply means more work for the same pay. That's a lot of topics for a short book, and structurally this is a grab bag. Some sections will land and some won't. Loom's video messages sound like a nightmare to me, and I rolled my eyes heavily at the VR boosterism, reluctant as it may be.
The section on DEI (diversity, equity, and inclusion) was a valiant effort that at least gestures towards the dismal track record of most such efforts, but still left me unconvinced that anyone knows how to improve diversity in an existing organization without far more brute-force approaches than anyone with power is usually willing to consider. But there's enough here, and the authors move through topics quickly enough, that a section that isn't working for you will soon be over. And some of the sections that do work are great. For example, the whole discussion of management.
Many of these companies view middle management as bloat, waste, what David Graeber would call a "bullshit job." But that's because bad management is a waste; you're paying someone more money to essentially annoy everyone around them. And the more people experience that sort of bad management, and think of it as "just the way it is," the less they're going to value management in general.
I admit to a lot of confirmation bias here, since I've been ranting about this for years, but management must be the most wide-spread professional job for which we ignore both training and capability and assume that anyone who can do any type of useful work can also manage people doing that work. It's simply not true, it creates workplaces full of horrible management, and that in turn creates a deep and unhelpful cynicism about all management. There is still a tendency on the left to frame this problem in terms of class struggle, on the reasonable grounds that for decades under "scientific management" of manufacturing that's what it was. Managers were there to overwork workers and extract more profits for the owners, and labor unions were there to fight back against managers. But while some of this does happen in the sort of office work this book is focused on, I think Warzel and Petersen correctly point to a different cause.
"The reason she was underpaid on the team was not because her boss was cackling in the corner. It was because nobody told the boss it was their responsibility to look at the fucking spreadsheet."
We don't train managers, we have no clear expectations for what managers should do, we don't meaningfully measure their performance, we accept a high-overhead and high-chaos workstyle based on ad hoc one-to-one communication that de-emphasizes management, and many managers have never seen good management and therefore have no idea what they're supposed to be doing. The management problem for many office workers is less malicious management than incompetent management, or simply no effective management at all apart from an occasional reorg and a complicated and mind-numbing annual review form. The last section of this book (apart from concluding letters to bosses and workers) is on community, and more specifically on extracting time and energy from work (via the roadmap in previous chapters) and instead investing it in the people around you. Much ink has been spilled about the collapse of American civic life, about how we went from a nation of joiners to a nation of isolated individual workers with weak and failing community institutions. Warzel and Petersen correctly lay some blame for this at the foot of work, and see the reorganization of work and an increase in work from home (and thus a decrease in commutes) as an opportunity to reverse that trend. David Brooks recently filled in for Ezra Klein on his podcast and talked with University of Chicago professor Leon Kass, which I listened to shortly after reading this book. In one segment, they talked about marriage and complained about the decline in marriage rates. They were looking for causes in people's moral upbringing, in their life priorities, in the lack of aspiration for permanence in kids these days, and in any other personal or moral failing that would allow them to be smugly judgmental. It was a truly remarkable thing to witness. Neither man at any point in the conversation mentioned either money or time. Back in the world most Americans live in, real wages have been stagnant for decades, student loan debt is skyrocketing as people desperately try to keep up with the ever-shifting requirements for a halfway-decent job, and work has expanded to fill all hours of the day, even for people who don't have to work multiple jobs to make ends meet. Employers have fully embraced a "flexible" workforce via layoffs, micro-optimizing work scheduling, eliminating benefits, relying on contract and gig labor, and embracing exceptional levels of employee turnover. The American worker has far less of money, time, and stability, three important foundations for marriage and family as well as participation in most other civic institutions. People like Brooks and Kass stubbornly cling to their feelings of moral superiority instead of seeing a resource crisis. Work has stolen the resources that people previously put into those other areas of their life. And it's not even using those resources effectively. That's, in a way, a restatement of the topic of this book. Our current way of organizing work is not sustainable, healthy, or wise. Working from home may be part of a strategy for changing it. The pandemic has already heavily disrupted work, and some of those changes, including increased working from home, seem likely to stick. That provides a narrow opportunity to renegotiate our arrangement with work and try to make those changes stick. I largely agree with the analysis, but I'm pessimistic. I think the authors are as well. We're very bad at social change, and there will be immense pressure for everything to go "back to normal." 
Those in the best bargaining position to renegotiate work for themselves are not in the habit of sharing that renegotiation with anyone else. But I'm somewhat heartened by how much public discussion there currently is about a more fundamental renegotiation of the rules of office work. I'm also reminded of a deceptively profound aphorism from economist Herbert Stein: "If something cannot go on forever, it will stop." This book is a bit uneven and is more of a collection of related thoughts than a cohesive argument, but if you are hungry for more worker-centric analyses of the dynamics of office work (inside or outside the office), I think it's worth reading. Rating: 7 out of 10

26 December 2021

Vincent Bernat: Custom screen saver with XSecureLock

i3lock is a popular X11 screen lock utility. As far as customization goes, it only allows one to set a background from a PNG file. This limitation is part of the design of i3lock: its primary goal is to keep the screen locked, something difficult enough with X11. Each additional feature would increase the attack surface and move away from this goal.1 Many are frustrated with these limitations and extend i3lock through simple wrapper scripts or by forking it.2 The first solution is usually safe, but the second goes against the spirit of i3lock. XSecureLock is a less-known alternative to i3lock. One of the most attractive features of this locker is to delegate the screen saver feature to another process. This process can be anything as long as it can attach to an existing window provided by XSecureLock, which won't pass any input to it. It will also put a black window below it to ensure the screen stays locked in case of a crash. XSecureLock is shipped with a few screen savers, notably one using mpv to display photos or videos, like the Apple TV aerial videos. I have written my own saver using Python and GTK.3 It shows a background image, a clock, and the current weather.4
Custom screen saver for XSecureLock, displaying a clock and the current weather
I add two patches over XSecureLock:
  • Sleep before mapping screen saver window. This patch prevents a flash of black when starting XSecureLock by waiting a bit for the screen saver to be ready before displaying it. As I am also using a custom dimmer fading to the expected background before locking, the flash of black was quite annoying for me. I have good hope this patch will be accepted upstream.
  • Do not mess with DPMS/blanking. This patch prevents XSecureLock from blanking the screen. I think this is solely the role of the X11 DPMS extension. This makes the code simpler. I am unsure if this patch would be accepted by upstream.
XSecureLock also delegates the authentication window to another process, but I was less comfortable providing a custom one as it is a bit more security-sensitive. While basic, the shipped authentication application is fine by me. I think people should avoid modifying i3lock code and use XSecureLock instead. I hope this post will help a bit.
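For anyone wanting to try the same approach, the saver is selected through XSecureLock's environment variables; a minimal sketch using the shipped mpv-based saver (check the README for how a custom helper is picked up by your version):
# Lock the screen with the mpv-based saver shipped with XSecureLock;
# XSECURELOCK_SAVER selects which saver helper to run.
env XSECURELOCK_SAVER=saver_mpv xsecurelock
# A custom saver is just another helper executable; whether it can be
# referenced by path here or must live in XSecureLock's helpers
# directory depends on the version, so check the documentation.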

Update (2022-01) XScreenSaver can also run arbitrary programs as a screen saver.


  1. See for example this comment or this one explaining the rationale.
  2. This Reddit post enumerates many of these alternatives.
  3. Using GTK makes it a bit difficult to use some low-level features, like embedding an application into an existing window. However, the high-level features are easier, notably drawing an image and a text with a shadow.
  4. Weather is retrieved by another script running on a timer and written to a file. The screen saver watches this file for updates.
